Series Editors’ Foreword


This is the 15th volume of the ‘Springer Series on Touch and Haptic Systems’, which is
published as a collaboration between Springer and the EuroHaptics Society.

Musical Haptics explores haptic interaction during the auditory experience of 
music and the combination of auditory and haptic information during instrumental 
performance. Auditory and haptic channels receive vibrations during instrument 
performance. This multimodal interaction is analysed from the points of view of 
both the audience and the musicians. 

The book is organized into two parts and 13 chapters: the first part is devoted to the
fundamentals of haptic interaction and perception of musical cues, and the second part
shows examples of haptic musical interfaces. A glossary at the end explicitly defines
specific terminology.

A successful workshop on Musical Haptics at the EuroHaptics 2016 conference 
in London led to the writing of this book. The editors have created an excellent 
compilation of the work introduced during the workshop and added new material to 
produce a cutting-edge volume. Moreover, this publication is the first open access
issue in this Springer series, which represents an eagerly anticipated development
for our community.


January 2018 Manuel Ferre 
Marc O. Ernst 
Alan Wing 


Preface 


The two fields of haptics and music are naturally connected in a number of ways. 
As a matter of fact, sound is nothing more than the auditory manifestation of 
vibration. When attending a concert, we are reached not only by airborne acoustic 
waves but also by related vibratory cues conveyed through the air and solid media 
such as the floor and seats. Moving from the audience to the performance stage, it is 
thanks to a complex system of auditory-haptic interactions established between
musicians and their instruments that the former can render subtle expressive 
nuances and develop virtuosic playing techniques, and that being at a concert is 
such a rewarding experience. 

Whereas auditory research has long addressed the musical scenario,
research on haptics has only recently started to consider it. This volume aims to fill
this gap by collecting for the first time state-of-the-art contributions from distin- 
guished scholars and young researchers working at the intersection of haptics and 
music performance. It presents theoretical, empirical, and practical aspects of haptic 
musical interaction and perception, such as the role of haptics in music performance 
and fruition, and describes the design and evaluation of digital musical interfaces 
that provide haptic feedback. 

The realization of this volume was originally encouraged by Prof. Manuel Ferre, 
following the successful organization of a scientific workshop on Musical Haptics 
by Stefano Papetti at the EuroHaptics 2016 conference. The workshop hosted some 
of the most renowned world experts in the field and fostered discussion, exchange, 
and collaboration to help address theoretical and empirical challenges in Musical 
Haptics research. It was, in a way, the crowning event of the project Audio-Haptic
modalities in Musical Interfaces¹ (2014-2016), an interdisciplinary research project
funded by the Swiss National Science Foundation, which initiated an exploratory
investigation into the role of haptics and the sense of touch in music practice.


¹ http://p3.snf.ch/project-150107 (last accessed on Nov 27, 2017).
The present volume primarily features contributions from presenters at the
EuroHaptics workshop. Additional authors were invited based on their established 
activities and recent outstanding results. Mirroring the implicitly interdisciplinary 
nature of Musical Haptics, contributions come from a variety of scientific back- 
grounds, such as music composition and performance, acoustics, mechanical 
engineering, robotics, sound and music computing, music perception, and cognitive 
neuroscience, thus bringing diverse viewpoints on a number of common topics. 

Following an introduction which sets out the scope, aims, and relevance of 
Musical Haptics, the volume comprises 12 contributed chapters divided into two 
parts. Part I examines the relevance of haptic cues in music performance and 
perception, discussing how they affect user experience and performance in terms of 
usability, functionality, and perceived quality of musical instruments. Part II pre- 
sents engineering, computational, and design approaches and guidelines that have 
been applied to render and exploit haptic feedback in digital musical interfaces. The 
two parts are distinct yet complementary: studying the perception of haptics 
requires sophisticated rendering techniques; developing sophisticated rendering 
techniques for haptics requires a good understanding of its psychophysics. To help 
the reader, a glossary is included that gathers in one place explanations of concepts 
and tools recurring throughout the book. 

Musical Haptics is intended for haptic engineers, researchers in human-computer
interaction, music psychologists, interaction designers, musical instrument
designers, and musicians who, for example, would like to gain insight into the
haptic exchange between musicians and their instruments and its relevance for user
experience, quality perception, and musical performance, as well as practical
guidelines for the use of haptic feedback in musical devices and other
human-computer interfaces. It is hoped that the present volume will contribute towards a
scientific foundation of haptic musical interfaces, even though it has not been
possible to take all aspects into account.

We thank the Institute for Computer Music and Sound Technology (ICST) at the 
Zurich University of the Arts (ZHdK) for funding the publication of the present 
volume in Open Access form, along with the Alexander von Humboldt Foundation 
for supporting C.S. through a Humboldt Research Fellowship. We are especially 
grateful to ICST Director Germán Toro-Pérez for his continuous support, as well as
to Federico Avanzini and Federico Fontana for their precious organizational advice. 
Finally, we would like to thank all the authors for their valuable contribution to this 
book. 


Zurich, Switzerland Stefano Papetti 
Berlin, Germany Charalampos Saitis 
December 2017 


Contents


1 Musical Haptics: Introduction
Stefano Papetti and Charalampos Saitis


Part I Musical Haptics: Interaction and Perception


2 Once More, with Feeling: Revisiting the Role of Touch in
Performer-Instrument Interaction
Sile O’Modhrain and R. Brent Gillespie


3 A Brief Overview of the Human Somatosensory System
Vincent Hayward


4 Perception of Vibrotactile Cues in Musical Performance
Federico Fontana, Stefano Papetti, Hanna Järveläinen,
Federico Avanzini and Bruno L. Giordano


5 The Role of Haptic Cues in Musical Instrument Quality
Perception
Charalampos Saitis, Hanna Järveläinen and Claudia Fritz


6 A Functional Analysis of Haptic Feedback in Digital Musical
Instrument Interactions
Gareth W. Young, David Murphy and Jeffrey Weeter


7 Auditory-Tactile Experience of Music
Sebastian Merchel and M. Ercan Altinsoy


Part II Haptic Musical Interfaces: Design and Applications


8 The MSCI Platform: A Framework for the Design and
Simulation of Multisensory Virtual Musical Instruments
James Leonard, Nicolas Castagné, Claude Cadoz and Annie Luciani


9 Force-Feedback Instruments for the Laptop Orchestra of
Louisiana
Edgar Berdahl, Andrew Pfalz, Michael Blandino
and Stephen David Beck


10 Design of Vibrotactile Feedback and Stimulation for Music
Performance
Marcello Giordano, John Sullivan and Marcelo M. Wanderley


11 Haptics for the Development of Fundamental Rhythm Skills,
Including Multi-limb Coordination
Simon Holland, Anders Bouwer and Oliver Hödl


12 Touchscreens and Musical Interaction
M. Ercan Altinsoy and Sebastian Merchel


13 Implementation and Characterization of Vibrotactile
Interfaces
Stefano Papetti, Martin Fröhlich, Federico Fontana,
Sébastien Schiesser and Federico Avanzini


Glossary and Abbreviations


Contributors 


M. Ercan Altinsoy Institut für Akustik und Sprachkommunikation, Technische
Universität Dresden, Dresden, Germany 


Federico Avanzini Dipartimento di Informatica, Università di Milano, Milano,
Italy 


Stephen David Beck School of Music & CCT—Center for Computation and 
Technology, Louisiana State University, Baton Rouge, LA, USA 


Edgar Berdahl School of Music & CCT—Center for Computation and 
Technology, Louisiana State University, Baton Rouge, LA, USA 


Michael Blandino School of Music & CCT—Center for Computation and 
Technology, Louisiana State University, Baton Rouge, LA, USA 


Anders Bouwer Faculty of Digital Media and Creative Industries, Amsterdam 
University of Applied Sciences, Amsterdam, The Netherlands 


Claude Cadoz ACROE—Association pour la Création et la Recherche sur les 
Outils d’Expression & Laboratoire ICA—Ingénierie de la Création Artistique,
Institut polytechnique de Grenoble, Université Grenoble Alpes, Grenoble, France 


Nicolas Castagné Laboratoire ICA—Ingénierie de la Création Artistique, Institut
polytechnique de Grenoble, Université Grenoble Alpes, Grenoble, France 


Federico Fontana Dipartimento di Scienze Matematiche, Informatiche e Fisiche, 
Università di Udine, Udine, Italy


Claudia Fritz Equipe LAM—Lutheries-Acoustique-Musique, Institut Jean le 
Rond d’Alembert UMR 7190, Université Pierre et Marie Curie - CNRS, Paris, 
France 


Martin Fröhlich ICST—Institute for Computer Music and Sound Technology,
Zürcher Hochschule der Künste, Zurich, Switzerland




R. Brent Gillespie Mechanical Engineering, University of Michigan, Ann Arbor, 
MI, USA 


Bruno L. Giordano Institut de Neurosciences de la Timone UMR 7289, 
Aix-Marseille Université-Centre National de la Recherche Scientifique, Marseille, 
France 


Marcello Giordano IDMIL—Input Devices and Music Interaction Laboratory, 
CIRMMT—Centre for Interdisciplinary Research in Music Media and Technology, 
McGill University, Montréal, QC, Canada 


Vincent Hayward Sorbonne Universités, Université Pierre et Marie Curie, Institut 
des Systèmes Intelligents et de Robotique, Paris, France 


Oliver Hödl Cooperative Systems Research Group, Faculty of Computer Science,
University of Vienna, Vienna, Austria 


Simon Holland Music Computing Lab, Centre for Research in Computing, The 
Open University, Milton Keynes, UK 


Hanna Järveläinen ICST—Institute for Computer Music and Sound Technology,
Zürcher Hochschule der Künste, Zurich, Switzerland


James Leonard Laboratoire ICA—Ingénierie de la Création Artistique, Institut
polytechnique de Grenoble, Université Grenoble Alpes, Grenoble, France 


Annie Luciani ACROE—Association pour la Création et la Recherche sur les 
Outils d’Expression & Laboratoire ICA—Ingénierie de la Création Artistique, 
Institut polytechnique de Grenoble, Université Grenoble Alpes, Grenoble, France 


Sebastian Merchel Institut für Akustik und Sprachkommunikation, Technische
Universität Dresden, Dresden, Germany 


David Murphy University College Cork, Cork, Ireland 


Sile O’Modhrain School of Information & School of Music, Theatre and Dance, 
University of Michigan, Ann Arbor, MI, USA 


Stefano Papetti ICST—Institute for Computer Music and Sound Technology, 
Zürcher Hochschule der Künste, Zurich, Switzerland 


Andrew Pfalz School of Music & CCT—Center for Computation and 
Technology, Louisiana State University, Baton Rouge, LA, USA 


Charalampos Saitis Audio Communication Group, Technische Universität
Berlin, Berlin, Germany 


Sébastien Schiesser ICST—Institute for Computer Music and Sound Technology,
Zürcher Hochschule der Künste, Zurich, Switzerland 


John Sullivan IDMIL—Input Devices and Music Interaction Laboratory, 
CIRMMT—Centre for Interdisciplinary Research in Music Media and Technology, 
McGill University, Montréal, QC, Canada 




Marcelo M. Wanderley IDMIL—Input Devices and Music Interaction 
Laboratory, CIRMMT—Centre for Interdisciplinary Research in Music Media and
Technology, McGill University, Montréal, QC, Canada 


Jeffrey Weeter University College Cork, Cork, Ireland 
Gareth W. Young University College Cork, Cork, Ireland 


Chapter 1
Musical Haptics: Introduction


Stefano Papetti and Charalampos Saitis 


Abstract This chapter introduces the concept of musical haptics, its scope, aims,
and challenges, as well as its relevance and impact for general haptics and
human-computer interaction. A brief summary of subsequent chapters is given.


1.1 Scope and Goals 


Musical haptics is an emerging interdisciplinary field investigating touch and pro- 
prioception in music scenarios from the perspectives of haptic engineering, human- 
computer interaction (HCI), applied psychology, musical acoustics, aesthetics, and 
music performance. 

The goals of musical haptics research may be summarized as: (i) to understand 
the role of haptic interaction in music experience and instrumental performance, and 
(ii) to create new musical devices yielding meaningful haptic feedback. 


1.2 Haptic Cues in Music Practice and Fruition 


Whenever an acoustic or electroacoustic musical instrument produces sound, that
sound comes from its vibrating components (e.g., the reed and air column in a
clarinet, or the strings and soundboard of a piano). During performance on such
instruments, the haptic channel is involved in a complex action-perception loop: the
player physically interacts with the instrument, on the one hand, to generate sound by
injecting energy in the form of forces, velocities, and displacements (e.g., striking
the keys of a keyboard, or bowing, plucking, and pressing the strings of a violin), and
on the other hand, to receive and perceive the instrument’s physical response (e.g.,
the instrument’s body vibration, the kinematics of keys being depressed, the resistance
and vibration of strings). One could therefore assume that the haptic channel supports
performance control (e.g., timing, intonation) as well as expressivity (e.g., timbre,
emotion). In particular, skilled performers are known to establish a very intimate,
rich haptic exchange with their instruments, resulting in truly embodied interaction
that is hard to find in other human-machine contexts. Through training-based learning
of haptic cues and auditory-tactile interactions, musicians develop highly precise
auditory-motor skills [7, 28]. They then form a base of highly demanding users who
expect top-quality interaction with their instruments and tools (i.e., extensive
control, consistent response, and maximum efficiency), an interaction that extends
beyond mere performance goals to emotional and aesthetic outcomes.

In addition to what is described above, both the performers and the audience are
reached by vibration conveyed through air and solid media such as the floor and the
seats of a concert hall. Those vibratory cues may then contribute to the perception of
music (e.g., its perceived quality) and of instrumental performance (e.g., in an
ensemble, a player may be able to monitor others’ performances through such cues as well).

Music fruition and performance therefore present a well-defined framework in 
which to study basic psychophysical, perceptual, and biomechanical aspects of touch 
and proprioception, all of which may inform the design of novel haptic musical 
devices. There is now a growing body of scientific studies of music performance and 
perception from which to inform research in musical haptics, including topics and 
methods from the fields of psychophysics [19], biomechanics [11], music education 
[29], psycholinguistics [32], and artificial intelligence [20]. 


1.3 Musical Devices and Haptic Feedback 


While current digital musical instruments (DMIs) usually offer touch-mediated
interaction, they fall short of providing a natural physical experience to the
performer. With a few exceptions, they lack haptic cues other than those intrinsically
provided by their (passive) mechanics, if any (e.g., the kinematics of a digital piano
keyboard); in other words, their behavior is the same whether they are turned on or
off. This missing link between sound production and active haptic feedback, combined
with the fact that even sophisticated sound synthesis cannot (yet?) compete with the
complexity and liveliness of acoustically generated sound, generally makes the
experience of performing on DMIs less rewarding and rich than playing traditional
instruments. Try asking a
professional pianist, especially a classically trained one, to play a digital piano and 
watch out! However, one could argue that establishing a rich haptic exchange between 
musicians and their digital tools would enhance performance control, expressivity, 
and user experience, while the music listening experience would be improved by 
conveying audio-related vibratory cues to the listener. Indeed, a recently renewed
interest in advancing haptic interaction design for everyday intelligent interfaces
(shared across the HCI and engineering communities, as well as the consumer
electronics industry) promotes the idea that haptics has the potential to greatly
improve usability, engagement, learnability, and the overall experience of the user,
with minimal or no requirement for constant visual attention [15, 17]. For example,
haptic feedback is already used to improve robotic control in surgical teleoperation 
[27] and to increase realism and immersion in virtual reality applications [30]. 

With regard to applications, haptic musical interfaces may provide feedback on 
the performance itself or on various musical processes (e.g., representing a score). In 
addition to enhancing performance control and expressivity, they have a high poten- 
tial as tools for music tuition, for providing guidance in (intrinsically noisy) large 
ensembles and remote performance scenarios, and for facilitating access to music 
practice and fruition for persons affected by somatosensory, visual, and even hearing 
impairments [6, 13, 21]. A notable example is the virtuoso and profoundly deaf
percussionist Evelyn Glennie, who has explained her use of vibrotactile cues in
musical performance, to the point of recognizing pitch based on where the vibrations
are felt on her body [10]. A further potential application of programmable haptic
feedback in musical interfaces is to offer a way of prototyping the mechanical response
of components found in traditional instruments (e.g., the kinematics and vibratory 
behavior of a piano keyboard), thus saving time and lowering production costs, as 
opposed to traditional hardware development. 

Some efforts have been made in recent years to define a systematic approach for the
design of haptic DMIs and to assess their utility [3, 9, 23]. Some of the developed
prototypes simulate the haptic behavior of existing acoustic or electroacoustic
instruments, while others implement new paradigms not necessarily linked to traditional
instruments. Early examples of haptic musical interfaces consist of piano-like
keyboards with computer-driven mechanical feedback for simulating touch responses of
various keyboard instruments (e.g., harpsichord, organ, piano) [4, 8]. More recently,
a haptic system using magneto-rheological technology was developed that could
reproduce the dynamic behavior of piano keyboards [16]. A vibrotactile feedback
system for open-air music controllers, based on an actuated ring or a foot stimulator,
was proposed in [31]. Haptic DMIs inspired by traditional instruments (violin,
woodwinds, monochord, and slide whistle) are described in [2, 18, 22]. In [26], actuators
were used on acoustic and electroacoustic instruments to feed mechanical energy
back and induce or dampen resonances.

Only a few commercial examples of haptic musical devices are currently available.
The Yamaha AvantGrand¹ series of digital pianos embeds vibration transducers
simulating the effect of vibrating strings and soundboard, and of pedal depression. The
system can be turned on or off, and the vibration intensity can be adjusted. The
Ultrasonic Audio Syntact² is a midair musical interface that performs hand-gesture
analysis by means of a camera, and provides tactile feedback at the hand through an
array of ultrasonic transducers. The Soundbrenner Pulse³ is a wearable vibrotactile
metronome. The Lofelt Basslet⁴ and Subpac⁵ are wearable low-frequency vibration
transducers (tactile subwoofers), in the form of a bracelet and a vest, respectively,
whose goal is to enhance the music listening experience.


¹ https://europe.yamaha.com/en/products/musical_instruments/pianos/avantgrand/ (last accessed
on Dec 7, 2017).
² http://www.ultrasonic-audio.com/products/syntact.html (last accessed on Dec 7, 2017).


1.4 Challenges 


Research in musical haptics faces several challenges, some of which are common to 
haptic engineering and HCI in general. 

From a technology viewpoint, the use of sensors and actuators can be especially 
problematic because haptic musical interfaces should generally be compact and unob- 
trusive (to allow for seamless interaction), efficient in terms of power (so they can be 
compatible with current consumer electronics industrial processes), and offer high 
fidelity/accuracy (to enable sensing subtle gestures and rendering complex haptic 
cues). Musical haptics would then gain from further developments in sensing and 
actuator technology in those directions. 

From the perspective of HCI and psychophysics, the details of how the haptic 
modality is actually involved and exploited while performing with traditional musical 
instruments or while listening to music are still largely unknown. More psychophysical
and behavioral evidence is needed to establish the biomechanics of touch and how
haptic cues affect measurable performance parameters such as accuracy in timing,
intonation, and dynamics, as well as to better understand the role of vibration in
idiosyncratic perceptions of sound/instrument quality by performers and of
music/sound aesthetics by listeners.

What is more, haptic musical interfaces are interactive systems that require rigor- 
ous user experience evaluation to help define optimal configurations between percep- 
tual effects and limitations on the one hand, and technological solutions on the other 
[5, 12, 33]. Despite the fact that several evaluation frameworks have been proposed 
[14, 24, 34], the evaluation of digital musical devices and related user experience 
currently suffers from a lack of commonly accepted goals, criteria, and methods [1, 
25]. 


³ http://www.soundbrenner.com (last accessed on Dec 23, 2017).
⁴ https://lofelt.com/ (last accessed on Dec 7, 2017).
⁵ http://subpac.com/ (last accessed on Dec 23, 2017).


1.5 Outline


The first part of the book presents theoretical and empirical work in musical haptics
with particular emphasis on biomechanical, psychophysical, and behavioral aspects
of music performance and music perception. Chapter 2 redefines, with an original
perspective, the biomechanics of the musician-instrument interaction as a tight
dynamic coupling, rather than the mere interaction of two separate entities. Chapter 3
introduces basic concepts and functions related to the anatomy and physiology of the 
human somatosensory system with special focus on the perception of touch, pressure, 
vibration, and movement. Chapter 4 reports experiments investigating vibrotactile 
perception in finger-pressing tasks and while performing on the piano. Chapter 5 
examines the role of vibrotactile cues on the perception of sound/instrument quality 
from the perspective of the musician, based on recent psycholinguistic and psy- 
chophysical evidence from violin and piano studies. Chapter 6 reports an experiment 
that uses quantitative and qualitative HCI evaluation methods to assess how various 
types of haptic feedback on a DMI affect aspects of functionality, usability, and user 
experience. Chapter 7 considers a music listening scenario for different musical gen- 
res and tests how body vibrations—generated from the original audio signal using a 
variety of approaches—influence the musical experience of the listener. 

The second part of the volume presents design examples, applications, and
evaluations of haptic musical interfaces. Chapter 8 describes an advanced
hardware-software system for real-time rendering of physically modeled virtual instruments
that can be played with force feedback, and its use as a creative artistic tool. Chapter 
9 examines hardware and computing solutions for the development of haptic force- 
feedback DMIs through a case study of music compositions for the Laptop Orchestra 
of Louisiana. Chapter 10 proposes and evaluates the design of a taxonomy of
vibrotactile cues and a stimulation system consisting of wearable garments for providing
information similar to a score during music performance. Chapter 11 reports a series
of experiments investigating the design and evaluation of vibrotactile stimulation 
for learning rhythm skills of varying complexity, with a special emphasis on multi- 
limb coordination. Chapter 12 evaluates the use of touchscreen interfaces augmented 
with audio-driven vibrotactile cues in music production, focusing on performance, 
user experience, and the cross-modal effect of audio loudness on tactile intensity. 
Chapter 13 illustrates common vibrotactile actuator technology and provides three
examples of audio-haptic interfaces iteratively designed through validation pro- 
cedures that tested their accuracy in measuring user gesture and in delivering 
vibrotactile cues. 

A glossary at the end of the book provides descriptions (including related abbre- 
viations) of concepts and tools that are frequently mentioned throughout the vol- 
ume, offering a useful background for those less acquainted with haptic and music 
technology. 
Abstract The dynamical response of a musical instrument plays a vital role in
determining its playability. This is because, for instruments where there is a phys- 
ical coupling between the sound-producing mechanism of the instrument and the 
player’s body (as with any acoustic instrument), energy can be exchanged across 
points of contact. Most instruments are strong enough to push back; they are springy, 
have inertia, and store and release energy on a scale that is appropriate and well 
matched to the player’s body. Haptic receptors embedded in skin, muscles, and 
joints are stimulated to relay force and motion signals to the player. We propose that 
the performer-instrument interaction is, in practice, a dynamic coupling between a 
mechanical system and a biomechanical instrumentalist. We take a stand on what 
is actually under the control of the musician, claiming it is not the instrument that 
is played, but the dynamic system formed by the instrument coupled to the musi- 
cian’s body. In this chapter, we suggest that the robustness, immediacy, and potential 
for virtuosity associated with acoustic instrument performance are derived, in no 
small measure, from the fact that such interactions engage both the active and pas- 
sive elements of the sensorimotor system and from the musician’s ability to learn 
to control and manage the dynamics of this coupled system. This, we suggest, is 
very different from an interaction with an instrument whose interface only supports 
information exchange. Finally, we suggest that a musical instrument interface that 
incorporates dynamic coupling likely supports the development of higher levels of 
skill and musical expressiveness. 


2.1 Introduction


The mechanics of a musical instrument’s interface—what the instrument feels 
like—determines a great deal of its playability. What the instrument provides to 
be held, manipulated by mouth or hand, or otherwise controlled has obvious but also 
many subtle implications for how it can be used for musical expression. One means 
to undertake an analysis of playability and interface mechanics is in terms of the 
mechanical energy that is exchanged between a player’s body and the instrument. 
For acoustic instruments, mechanical energy injected by the player is transformed 
into acoustic energy through a process of resonance excitation. For electronic instru- 
ments, electrical energy is generally transformed into acoustic energy through a 
speaker, but controlled by interactions involving the player’s body and some physi- 
cal portion of the instrument. 

Importantly, there exists the possibility for mechanical energy stored in the 
physical part of the instrument to be returned to the player’s body. This possibility 
exists for both acoustic and electronic instruments, though in acoustic instruments 
it is in fact a likelihood. This likelihood exists because most acoustic instruments 
are strong enough to push back; they are springy, have inertia, and store and return 
energy on a scale that is roughly matched to the scale at which the player’s body 
stores and returns energy. Given that energy storage and return in the player’s body 
is determined by passive elements in muscle and tissues, one can say that the scale 
at which interface elements of the instrument are springy and have mass is similar to 
the scale at which muscles and tissues of the player are springy and have mass. That 
is, the mechanics of most acoustic instruments are roughly impedance matched to 
the biomechanics of the player’s body. Impedance matching facilitates the exchange 
of energy between passive elements within the instrument and passive elements that 
are part of the biomechanics of the player. Thus the player’s joints are moved or 
backdriven by the instrument, muscle stiffness is loaded, and the inertial dynamics 
of body segments are excited. In turn, haptic receptors embedded in skin, muscles, 
and joints are stimulated and relay force and motion signals to the player. It is also 
no accident that the parts of the body that interact with instruments—lips, fingers, 
hands—are the most highly populated by haptic receptors. 
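
The energy exchange described above can be illustrated with a deliberately simple numerical sketch. It is not a model taken from this chapter: the function name simulate, the two-mass layout, and all parameter values are assumptions chosen only for illustration. A passive "player" element (mass, spring, damper) is coupled through a contact spring to a passive "instrument" element (mass, spring). The player element is given an initial velocity, and the code accumulates separately the energy that crosses the contact point towards the instrument and the energy that flows back towards the player.

```python
def simulate(m_player=0.05, k_player=200.0, b_player=0.5,
             m_instr=0.05, k_instr=200.0, k_contact=400.0,
             v0=0.5, dt=1e-4, steps=20000):
    """Integrate two coupled mass-spring elements (semi-implicit Euler).

    All values are hypothetical (SI units), not measured data.
    Returns (sent, returned): energy in joules that crossed the contact
    point from player to instrument, and from instrument back to player.
    """
    x_p, v_p = 0.0, v0    # player element: displacement, velocity
    x_i, v_i = 0.0, 0.0   # instrument element: displacement, velocity
    sent = 0.0
    returned = 0.0
    for _ in range(steps):
        # Force exerted on the player element by the contact spring.
        f_contact = k_contact * (x_i - x_p)
        a_p = (-k_player * x_p - b_player * v_p + f_contact) / m_player
        a_i = (-k_instr * x_i - f_contact) / m_instr
        # Update velocities first, then positions (semi-implicit Euler).
        v_p += a_p * dt
        v_i += a_i * dt
        x_p += v_p * dt
        x_i += v_i * dt
        # Positive power: the contact force does work on the player.
        power_to_player = f_contact * v_p
        if power_to_player > 0:
            returned += power_to_player * dt
        else:
            sent += -power_to_player * dt
    return sent, returned


if __name__ == "__main__":
    sent, returned = simulate()
    print(f"energy sent to instrument:  {sent:.6f} J")
    print(f"energy returned to player:  {returned:.6f} J")
```

With the contact stiffness set to zero, no energy crosses the interface in either direction, which is one way to picture an interface that supports information exchange only. Changing the instrument mass or stiffness by orders of magnitude relative to the player values gives a rough feel for what a mismatch in scale does to the exchange.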

In this chapter, we propose that performer-instrument interaction is a dynamic 
coupling between a mechanical system and a biomechanical instrumentalist. This 
repositions the challenge of playing an instrument as a challenge of “playing” the 
coupled dynamics in which the body is already involved. We propose that inter- 
actions in which both the active and passive elements of the sensorimotor system 
(see Chap. 3) are engaged form a backdrop for musical creativity that is much more 
richly featured than the set of actions one might impose on an instrument considered 
in isolation from the player’s body. We further wish to propose that the robustness, 
immediacy, and potential for virtuosity associated with acoustic instrument perfor- 
mance are derived, in no small measure, from the fact that such interactions engage 
both the active and passive elements of the sensorimotor system and determine the 
musician’s ability to learn and manage the dynamics of this coupled system. This, 
we suggest, is very different from an interaction with an electronic instrument whose
interface is only designed to support information exchange. 

We also suggest that a musical instrument interface that incorporates dynamic 
coupling supports the development of higher levels of skill and musical expressive- 
ness. To elaborate these proposals concretely, we will adopt a modeling approach 
that explicitly considers the role of the musician’s body in the process of extract- 
ing behaviors from a musical instrument. We will describe the springiness, inertia, 
and damping in both the body and the instrument in an attempt to capture how an 
instrument becomes an extension of the instrumentalist’s body. And insofar as the
body might be considered an integral part of the process of cognition, so too does
an instrument become a part of the process of finding solutions to musical problems
and producing expressions of musical ideas.


2.2 A Musician Both Drives and Is Driven 
by Their Instrument 


The standard perspective on the mechanics of acoustic instruments holds that energy 
is transformed from the mechanical to the acoustic domain—mechanical energy 
passes from player to instrument and is transformed by the instrument, at least in 
part, to acoustic energy that emanates from the instrument into the air. Models that 
describe the process by which mechanical excitation produces an acoustic response 
have been invaluable for instrument design and manufacture and have played a central 
role in the development of sound synthesis techniques, including modal synthesis [1] 
and especially waveguide synthesis [2] and physical modeling synthesis algorithms 
[3-5]. The role of the player in such descriptions is to provide the excitation or to 
inject energy. Using this energy-based model, the question of “control,” or how the 
player extracts certain behaviors including acoustic responses from the instrument,
reduces to considering how the player modulates the amount and timing of energy 
injected. 

While an energy-based model provides a good starting point, we argue here that 
a musician does more than modulate the amount and timing of excitation. Elaborat- 
ing further on the process of converting mechanical into acoustic energy, we might 
consider that not all energy injected is converted into acoustic energy. A portion of 
the energy is dissipated in the process of conversion or in the mechanical action of 
the instrument and a portion might be reflected back to the player. As an example, 
in Fig. 2.1, we show that a portion of the energy injected into the piano action by the 
player at the key is converted to sound, another portion is dissipated, and yet another 
portion is returned back to the player at the mechanical contact. 

Fig. 2.1 In response to energy injected at the key, the piano action reflects a portion, dissipates a
portion, and converts another portion into output sound

But a model that involves an injection of mechanical energy by the player does
not imply that all energy passes continuously in one direction, nor even that the
energy passing between player and instrument is under instantaneous control of
the player. There might also exist energy exchanges between the player’s body and
the instrument whose time course is instead governed by the coupling of mechanical
energy storage elements in the player’s body and the instrument. Conceivably, energy 
may even oscillate back and forth between the player and instrument, as governed 
by the coupled dynamics. For example, multiple strikes of a drumstick on a snare 
drum are easily achieved with minimal and discrete muscle actions because potential 
energy may be stored and returned in not only the drumhead but also in the finger grip 
of the drummer. To drive these bounce oscillations, the drummer applies a sequence 
of discrete muscle actions at a much slower rate than the rate at which the drumstick 
bounces. Then to control the bounce oscillation rate, players modulate the stiffness 
of the joints in their hand and arm [6]. 

We see, then, that energy exchanges across a mechanical contact between musi- 
cian and instrument yield new insights into the manner in which a player extracts 
behavior from an acoustic instrument. Cadoz and Wanderley, in defining the functions
of musical gesture, refer to this exchange of mechanical energy as the “ergotic”
function, the function which requires the player to do work upon the instrument
mechanism [7]. Chapter 8 describes a software-hardware platform that addresses
such issues. We extend this description here to emphasize that the instrument is a
system which, once excited, will also “do work” on the biomechanical system that 
is the body of the player. In particular, we shall identify passive elements in the 
biomechanics of the player’s body upon which the instrument can “do work” or 
within which energy returned from the instrument can be stored in the player’s body, 
without volitional neural control by the player’s brain. The drumming example elab- 
orated above already gives a flavor for this analysis. It is now important to consider 
the biomechanics of the player’s body. 

Note that relative to virtually all acoustic musical instruments, the human body 
has a certain give, or bends under load. Such bending under load occurs even when 
the body is engaged in manually controlling an instrument. In engineering terms, 
the human body is said to be backdrivable. And this backdrivability is part of the 
match in mechanical impedance between body and instrument. Simple observations 
support this claim, such as excursions that take place at the hand without volitional
control if the load from an instrument is unexpectedly applied or removed. Think,
for example, of the sudden slip of the bowing hand when the bow-string interaction
fails because of a lack of rosin [8]. It follows that significant power is exchanged 
between the player and instrument, even when the player is passive. Such power 
exchanges cannot be captured by representing the player as a motion source (an 
agent capable of specifying a motion trajectory without regard to the force required) 
or a force source (an agent capable of specifying a force trajectory without regard to 
the motion required). Because so much of the passive mechanics of the player’s body 
is involved, the contact between a human and machine turns out to hold disadvantages 
when it comes to dividing the human/machine system into manageable parts for the 
purposes of modeling. 

If good playability were to be equated with high control authority and the backdrivable
biomechanics ignored, then an instrument designer might maximize instrument
admittance while representing the player as a motion source or maximize instrument 
impedance while representing the player as a force source. Indeed, this approach to 
instrument design has, on the one hand, produced the gestural control interface that 
provides no force feedback and, on the other hand, produced the touch screen that 
provides no motion feedback. But here we reject representations of the player as 
motion or force source and label approaches which equate playability with high control
authority as misdirected. We contend that the gestural control interface lacking
force feedback and the touch screen are failures of musical instrument interface design
(Chap. 12 discusses the use of touch screen devices with tactile feedback for pattern- 
based music composition and mixing). We claim that increasing a player’s control 
authority does not amount to increasing the ability of the player to express their 
motor intent. Instead, the impedance of the instrument should be matched to that 
of the player, to maximize power transfer between player and machine and thereby 
increase the ability of the player to express their motor (or musical expression) intent. 
Our focus on motor intent and impedance rather than control authority amounts to a 
fundamental change for the field of human motor control and has significant implica- 
tions for the practice of designing musical instruments and other machines intended 
for human use. 
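
The claim that matching maximizes power transfer can be illustrated with a minimal sketch. It assumes a deliberately reduced, purely dissipative case that is not the model developed in this chapter: the player is a force source acting through an internal damping, the instrument is a second damper, and the two share the contact velocity. The function names and all numerical values are hypothetical.

```python
def power_to_instrument(force, b_player, b_instrument):
    """Steady power (W) delivered to the instrument damper.

    Assumption: a force source `force` (N) acts through the player's own
    damping `b_player` (N*s/m) and shares one contact velocity with the
    instrument damping `b_instrument` (N*s/m).
    """
    velocity = force / (b_player + b_instrument)  # common contact velocity
    return b_instrument * velocity ** 2


def best_instrument_damping(force, b_player, candidates):
    """Return the candidate instrument damping that receives the most power."""
    return max(candidates, key=lambda b: power_to_instrument(force, b_player, b))


if __name__ == "__main__":
    # Hypothetical sweep of instrument damping from 0.5 to 20.0 N*s/m.
    candidates = [0.5 * k for k in range(1, 41)]
    best = best_instrument_damping(force=1.0, b_player=5.0, candidates=candidates)
    print("instrument damping receiving most power:", best)
```

In this reduced case the delivered power peaks where the instrument damping equals the player damping and falls off toward both very small and very large values, which loosely parallels the two extremes criticized above. A full treatment would also include the mass and stiffness of both sides.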


2.3 The Coupled Dynamics: A New Perspective on Control 


In this chapter, we are particularly interested in answering how a musician controls 
an instrument. To competently describe this process, our model must capture two 
energy-handling processes in addition to the process by which mechanical energy 
is converted into acoustic energy: First, how energy is handled by the instrument 
interface, and second, how it is handled by the player’s body. Thereafter, we will 
combine these models to arrive at a complete system model in which not only energy 
exchanges, but also information exchanges can be analyzed, and questions of playa- 
bility and control can be addressed. 
For certain instruments, the interface mechanics have already been modeled to
describe what the instrument feels like to the player. Examples include models that 
capture the touch response of the piano action [9, 10] and feel of the drum head [11]. 

To capture the biomechanics of the player, suitable models are available from 
many sources, though an appropriately reduced model may be a challenge to find. In 
part, we seek a model describing what the player’s body “feels like” to the instrument, 
the complement of a model that describes what the instrument feels like to the player. 
We aim to describe the mechanical response of the player’s body to mechanical exci- 
tation at the contact with the instrument. Models that are competent without being 
overly complex may be determined by empirical means, or by system identification. 
Hajian and Howe [12] determined the response of the fingertip to a pulse force and 
Hasser and Cutkosky determined the response of a thumb/forefinger pinch grip to a 
pulse torque delivered through a knob [13]. Both of these works proposed parametric 
models in place of non-parametric models, showing that simple second-order mod- 
els with mass, stiffness, and damping elements fit the data quite well. More detailed 
models are certainly available from the field of biomechanics, where characteriza- 
tions of the driving point impedance of various joints in the body can be helpful for 
determining state of health. Models that can claim an anatomical or physiological 
basis are desirable, but such models run the risk of contributing complexity that 
would complicate the treatment of questions of control and playability. 

Models that describe what the instrument and body feel like to each other are both 
models of driving-point impedance. They each describe relationships between force 
and velocity at the point of contact between player and instrument. The driving- 
point impedance of the instrument expresses the force response of the instrument 
to a velocity imposed by the player, and the driving-point impedance of the player 
expresses the force response of the player to a velocity imposed by the instrument. 
Of course, only one member of the pair can impose a force at the contact. The other 
subsystem must respond with velocity to the force imposed at the contact; thus, 
its model must be expressed as a driving-point admittance. This restriction as to 
which variable may be designated an input and which an output is called a causality 
restriction (see, e.g., [14]). The designation is an essentially arbitrary choice that 
must be made by the analyst. Let us choose to model the player as an admittance 
(imposing velocity at the contact) and the instrument as an impedance (imposing 
force at the contact). 
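
To make this pairing concrete, the following sketch assumes that both sides are reduced to the second-order mass, damping, and stiffness form mentioned above. The symbols m, b, k and the subscripts I (instrument) and P (player) are introduced here for illustration only; F(s) and V(s) denote the Laplace transforms of force and velocity at the contact.

```latex
Z_I(s) = \frac{F(s)}{V(s)} = m_I s + b_I + \frac{k_I}{s},
\qquad
Y_P(s) = \frac{V(s)}{F(s)} = \frac{s}{m_P s^2 + b_P s + k_P}
```

Under this assumption the instrument takes velocity as its input and returns force, while the player takes force as input and returns velocity, which is the causality assignment chosen above.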

Driving-point impedance models that describe what the body or instrument feel 
like to each other provide most, but not all of what is needed to describe how a 
player controls an instrument. A link to muscle action in the player and a link to 
the process by which mechanical energy is converted into acoustic energy in the 
instrument are still required. In particular, our driving-point admittance model of the 
player must be elaborated with input/output models that account for the processing 
of neural and mechanical signals in muscle. In addition, our driving-point impedance 
model of the instrument must be elaborated with an input/output model that accounts 
for the excitation of a sound generation process. If our driving-point admittance and 
impedance models are lumped parameter models in terms of mechanical mass, spring, 
and damping elements, then we might expect the same parameters to appear in the 
input/output models that we use to capture the effect of muscle action and the process
of converting mechanical into acoustic energy.

[Figure 2.2: block diagram with Player (including muscle) and Instrument blocks; signal types: haptic/mechanical, neural, acoustic]
Fig. 2.2 Musician and instrument may both be represented as multi-input, multi-output systems.
Representing the instrument in this way, an operator G transforms mechanical excitation into
mechanical response. An operator P transforms mechanical excitation into acoustic response.
Representing the player, let H indicate the biomechanics of the player’s body that determines the
mechanical response to a mechanical excitation. The motor output of the player also includes a
process M, in which neural signals are converted into mechanical action. The response of muscle M
to neural excitation combines with the response of H to excitation from the instrument to produce
the action of the musician on the instrument. The brain produces neural activation of muscle by
monitoring both haptic and acoustic sensation. Blue arrows indicate neural signaling and neural
processing while red arrows indicate mechanical signals and green arrows indicate acoustic signals

Let us represent the process inside the instrument that transforms mechanical input 
into mechanical response as an operator G (see Fig. 2.2). This is the driving-point 
impedance of the instrument. And let the process that transforms mechanical input 
into acoustic response be called P. Naturally, in an acoustic instrument both G and 
P are realized in mechanical components. In a digital musical instrument, P is often 
realized in software as an algorithm. In a motorized musical instrument, even G can 
be realized in part through software [15]. 

As described above, in P, there is generally a change in the frequency range that 
describes the input and output signals. The input signal, or excitation, occupies a 
low-frequency range, usually compatible with human motor action. The relatively 
high-frequency range of the output is determined in an acoustic instrument by a 
resonating instrument body or air column that is driven by the actions of the player 
on the instrument. Basically, motor actions of the player are converted into acoustic 
frequencies in the process P. On the other hand, G does not usually involve a change 
in frequency range. 

[Figure 2.3: two block diagrams, panels a and b, each containing an Instrument block]
Fig. 2.3 Instrument playing considered as a control design problem. a The musician, from the
position of controller in a feedback loop, imposes their control actions on the instrument while
monitoring the acoustic and haptic response of the instrument. b From the perspective of dynamic
coupling, the “plant” upon which the musician imposes control actions is the system formed by the
instrument and the musician’s own body (biomechanics)

Boldly, we represent the musician as well, naming the processes (operators) that
transform input to output inside the nervous system and body of the musician.
Here we identify both neural and mechanical signals, and we identify processes that
transform neural signals, processes that transform mechanical signals (called
biomechanics), transducers that convert mechanical into neural signals (mechanoreceptors
and proprioceptors), and transducers that convert neural into mechanical signals
(muscles). Section 3.3.1 provides a description of such mechanisms. Let us denote those
parts of the musician’s body that are passive or have only to do with biomechanics
by the operator H. Biomechanics encompasses stiffness and damping in muscles and
mass in bones and flesh. That is, biomechanics includes the capacity to store and 
return mechanical energy in either potential (stiffness) or kinetic (inertial) forms and 
to dissipate energy in damping elements. Naturally, there are other features in the 
human body that produce a mechanical response to a mechanical input that involve 
transducers (sensory organs and muscles) including reflex loops and sensorimotor 
loops. Sensorimotor loops generally engage the central nervous system and often 
some kind of cognitive or motor processing. These we have highlighted in Fig. 2.2 
as a neural input into the brain and as a motor command that the brain produces in 
response. We also show the brain as the basis for responding to an acoustic input with 
a neural command to muscle. Finally, we represent muscle as the operator M that 
converts neural excitation into a motor action. The ears transform acoustic energy 
into neural signals available for processing and the brain in turn generates muscle 
commands that incite the action of the musician on the instrument. Figure 2.2 also
represents the action of the musician on the instrument as the combination of muscle
actions through M and response to backdrive by the instrument through H. Note that
the model in Fig. 2.2 makes certain assumptions about superposition, though not all
operators need be linear. 

This complete model brings us into position to discuss questions in control, that is, 
how a musician extracts desired behaviors from an instrument. We are particularly 
interested in how the musician formulates a control action that elicits a desired 
behavior or musical response from an instrument. We will attempt to unravel the 
processes in the formulation of a control action, including processes that depend on 
immediately available sensory input (feedback control) and processes that rely on 
memory and learning (open-loop control). 

As will already be apparent, the acoustic response of an instrument is not the only 
signal available to the player as feedback. In addition, the haptic response functions 
as feedback, carrying valuable information about the behavior of the instrument 
and complementing the acoustic feedback. Naturally, the player, as controller in a 
feedback loop, can modify his or her actions on the instrument based on a comparison 
of the desired sound and the actual sound coming from the instrument. But the player
can also modify his or her actions based on a comparison of the feel of the instrument 
and a desired or expected feel. A music teacher quite often describes a desired feel 
from the instrument, encouraging a pupil to adjust actions on the instrument until 
such a mechanical response can be recognized in the haptic response. One of the 
premises of this volume is that this second, haptic, channel plays a vital role in 
determining the “playability” of an instrument, i.e., in providing a means for the 
player to “feel” how the instrument behaves in response to their actions. 

In the traditional formulation, the instrument is the system under control or the 
“plant” in the feedback control system (see Fig. 2.3a). As controller, the player aims 
to extract a certain behavior from the instrument by imposing actions and monitoring 
responses. But given that the haptic response impinges on the player across the same
mechanical contact as the control action imposed by the player, an inner feedback 
loop is closed involving only mechanical variables. Neural signals and the brain of 
the instrument player are not involved. The mechanical contact and the associated 
inner feedback loop involve the two variables force and velocity whose product is 
power and is the basis for energy exchanges between player and instrument. That is, 
the force and motion variables that we identify at the mechanical contact between 
musician and instrument are special in that they transmit not only information but 
also mechanical energy. That energy may be expressed as the time integral of power,
the product of force and velocity at the mechanical contact. As our model developed
above highlights, a new dynamical system arises when the body’s biomechanics are 
coupled to the instrument mechanics. We shall call this new dynamical system the 
coupled dynamics. The inner feedback loop, which is synonymous with the coupled 
dynamics, is the new “plant” under control (see Fig. 2.3b). The outer feedback loop 
involves neural control and still has access to feedback in both haptic and audio 
channels. 
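
The relationship between force, velocity, power, and energy at the contact can be made concrete with a small numerical sketch. It assumes uniformly sampled force and velocity signals (synthetic data, not measurements) and approximates the time integral of power with the trapezoidal rule. The function names are hypothetical.

```python
import math


def contact_power(force, velocity):
    """Instantaneous power (W) at the contact: force times velocity."""
    return [f * v for f, v in zip(force, velocity)]


def energy_transferred(force, velocity, dt):
    """Net energy (J) over the record: trapezoidal integral of power."""
    p = contact_power(force, velocity)
    return sum(0.5 * (p[i] + p[i + 1]) * dt for i in range(len(p) - 1))


if __name__ == "__main__":
    dt = 0.001                                   # sample period in seconds
    t = [i * dt for i in range(1001)]            # one second of samples
    velocity = [math.sin(4 * math.pi * ti) for ti in t]      # 2 Hz motion
    # Force in phase with velocity, as for an ideal damper of 1 N*s/m.
    damper_force = list(velocity)
    # Force in quadrature with velocity, as for an ideal spring.
    spring_force = [math.cos(4 * math.pi * ti) for ti in t]
    print("damper-like net energy:", energy_transferred(damper_force, velocity, dt))
    print("spring-like net energy:", energy_transferred(spring_force, velocity, dt))
```

Over whole cycles the in-phase (damper-like) force yields a net energy transfer, whereas the quadrature (spring-like) force yields essentially none: energy flows out and is returned. The second case is the kind of exchange between storage elements that the coupled dynamics supports.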

In considering the “control problem,” we see that the coupled dynamics is a dif- 
ferent system, possibly more complex, than the instrument by itself. Paradoxically, 
the musician’s brain is faced with a greater challenge when controlling the coupled 
dynamical system that includes the combined body and instrument dynamics. There 
are new degrees of freedom (DoF) to be managed—dynamic modes that involve 
exchanges of potential and kinetic energy between body and instrument. But some- 
thing unique takes place when the body and instrument dynamics are coupled. A 
feedback loop is closed and the instrument becomes an extension of the body. The 
instrument interface disappears and the player gains a new means to effect change 
in their environment. This sense of immediacy is certainly at play when a skilled 
musician performs on an acoustic instrument. 

But musical instruments are not generally designed by engineers. Rather, they 
are designed by craftsmen and musicians—and usually by way of many iterations 
of artistry and skill. Oftentimes that skill is handed down through generations in a 
process of apprenticeship that lacks engineering analysis altogether. Modern devices, 
on the other hand—those designed by engineers—might function as extensions of 
the brain, but not so much as extensions of the body. While there is no rule that 
says a device containing a microprocessor could not present a vanishingly small or 
astronomically large mechanical impedance to its player, it can be said that digital
instrument designers to date have been largely unaware of the alternatives. Is it 
possible to design a digital instrument whose operation profits from power exchanges 
with its human player? We aim to capture the success of devices designed through 
craftsmanship and apprenticeship in models and analyses and thereby inform the 
design of new instruments that feature digital processing and perhaps embedded 
control. 


2.4 Inner and Outer Loops in the Interaction Between 
Player and Instrument 


Our new perspective, in which the “plant” under control by the musician is the 
dynamical system determined conjointly by the biomechanics of the musician and 
the mechanics of the instrument, yields a new view of the process of controlling
and learning to control an instrument. Consider, for a moment, the superior access
that the musician has to feedback from the dynamics of the coupled system relative 
to feedback from the instrument. The body is endowed with haptic sensors in the lips 
and fingertips, but also richly endowed with haptic and proprioceptive sensors in the 
muscles, skin, and joints. Motions of the body that are determined in part by muscle 
action but also in part by actions of the instrument on the body may easily be sensed. 
A comparison between such sensed signals and expected sensations, based on known 
commands to the muscles, provides the capability of estimating states internal to the 
instrument. See, for example, [16]. 

The haptic feedback thus available carries valuable information for the musician 
about the state of the instrument. The response might even suggest alternative actions 
or modes of interaction to the musician. For example, the feel of let-off in the piano 
action (after which the hammer is released) and the feel of the subsequent return 
of the hammer onto the repetition lever and key suggest the availability of a rapid 
repetition to the pianist. 

Let us consider cases in which the coupled dynamics provides the means to 
achieve oscillatory behaviors with characteristic frequencies that are outside the 
range of human volitional control. Every mechanical contact closes a feedback loop, 
and closing a feedback loop between two systems capable of storing and returning 
energy creates a new dynamic behavior. Speaking mechanically, if the new mode 
is underdamped, it would be called a new resonance or vibration mode. On the one 
hand, the force and motion variables support the exchange of mechanical energy; on 
the other hand, they create a feedback loop that is characterized by a resonance. Since 
we have identified a mechanical subsystem in both the musician and the instrument, 
it is noteworthy that these dynamics are potentially quite fast. There is no neural 
transmission or cognitive processing that takes place in this purely mechanical loop.
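
To illustrate how closing the mechanical loop creates a new mode, consider a minimal sketch in which the player’s finger and the instrument interface are each reduced to a mass, a damper, and a spring, and are assumed to remain rigidly in contact so that they share one velocity. Under that assumption the masses, dampings, and stiffnesses simply add. The class name and all parameter values are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class SecondOrder:
    """Lumped mass-damper-spring model of one side of the contact."""
    mass: float       # kg
    damping: float    # N*s/m
    stiffness: float  # N/m

    def natural_frequency_hz(self):
        """Undamped natural frequency in Hz."""
        return math.sqrt(self.stiffness / self.mass) / (2 * math.pi)

    def damping_ratio(self):
        """Dimensionless damping ratio; below 1 means underdamped."""
        return self.damping / (2 * math.sqrt(self.stiffness * self.mass))


def couple(player, instrument):
    """Rigid contact (assumed): common velocity, so element values add."""
    return SecondOrder(
        player.mass + instrument.mass,
        player.damping + instrument.damping,
        player.stiffness + instrument.stiffness,
    )


if __name__ == "__main__":
    finger = SecondOrder(mass=0.005, damping=0.5, stiffness=200.0)  # hypothetical
    key = SecondOrder(mass=0.05, damping=0.2, stiffness=50.0)       # hypothetical
    both = couple(finger, key)
    for name, system in (("finger", finger), ("key", key), ("coupled", both)):
        print(name, system.natural_frequency_hz(), system.damping_ratio())
```

The coupled system has a natural frequency and damping ratio that belong to neither subsystem alone. If the damping ratio is below one, the new mode is a resonance in the sense used above.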

Given that neural conduction velocities and the speed of cognitive processes may 
be quite slow compared to the rates at which potential and kinetic energy can be 
exchanged between two interconnected mechanical elements, certain behaviors in
the musician/instrument coupled dynamics can be attributed to an inner loop, not
involving closed-loop control by the musician’s nervous system. In particular, neural
conduction delays and cognitive processing times on the order of 100 ms would
preclude stable control of a lightly damped oscillator at more than about 5 Hz
[17], yet rapid piano trills exceeding 10 Hz are often used in music [18]. The existence 
of compliance in the muscles of the finger and the rebound of the piano key are 
evidently involved in an inner loop, while muscle activation is likely the output of a 
feedforward control process. 
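
One way to see where a limit near 5 Hz could come from, offered here as a back-of-the-envelope sketch and not as the analysis in [17], is to compute the phase lag that a pure time delay adds to a feedback loop. A delay of tau seconds contributes a lag of 360 * f * tau degrees at frequency f. Once that lag alone reaches 180 degrees, corrective feedback arrives in antiphase with the motion it is meant to correct. The function names are hypothetical.

```python
def delay_phase_lag_deg(frequency_hz, delay_s):
    """Phase lag in degrees contributed by a pure time delay."""
    return 360.0 * frequency_hz * delay_s


def antiphase_frequency_hz(delay_s):
    """Frequency at which the delay alone produces 180 degrees of lag."""
    return 0.5 / delay_s


if __name__ == "__main__":
    delay = 0.1  # seconds; the order of magnitude quoted in the text
    print("antiphase frequency (Hz):", antiphase_frequency_hz(delay))
    print("lag at a 10 Hz trill (deg):", delay_phase_lag_deg(10.0, delay))
```

With a 100 ms delay the lag reaches half a cycle at 5 Hz, and at a 10 Hz trill rate the feedback would be a full cycle late. This is consistent with attributing such rates to the inner mechanical loop driven by feedforward muscle activation.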

As we say, the musician is not playing the musical instrument but instead playing 
the coupled dynamics of his or her own body and instrument. Many instruments sup- 
port musical techniques which are quite evidently examples of the musician driving 
oscillations that arise from the coupled dynamics of body and instrument mechanics. 
For example, the spiccato technique in which a bow is “bounced” on a string involves 
driving oscillatory dynamics that arise from the exchange of kinetic and potential 
energy in the dynamics of the hand, the bow and hairs, and the strings. Similarly, 
the exchange of kinetic and potential energy underlies the existence of oscillatory 
dynamics in a drum roll, as described above. It is not necessary for the drummer 
to produce muscle action at the frequency of these oscillations, only to synchronize 
driving action to these oscillations [6]. 

The interesting question to be considered next is whether the perspective we have 
introduced here may have implications for the design of digital musical instruments: 
whether design principles might emerge that make a musical instrument an extension 
of the human body and a means for the musician to express their musical ideas. It 
is possible that answering such a question might also be the key to codifying certain 
emerging theories in the fields of human motor control and cognitive science. While 
it has long been appreciated that the best machine interface is one that “disappears” 
from consciousness, a theory to explain such phenomena has so far been lacking. 

The concept of dynamic coupling introduced here also suggests a means for a 
musician to learn to control an instrument. First, we observe that humans are very 
adept at controlling their bodies when not coupled to objects in the environment. 
Given that the new control challenge presented when the body is coupled to an 
instrument in part involves dynamics that were already learned, it can be said that 
the musician already has some experience even before picking up an instrument for 
the first time. Also, to borrow a term from robotics, the body is hyper-redundantly 
actuated and equipped with a multitude of sensors. From such a perspective, it makes 
sense to let the body be backdriven by the instrument, because only then do the 
redundant joints become engaged in controlling the instrument. 

An ideal musical instrument is a machine that extends the human body. From 
this perspective, it is the features in a musical instrument’s control interface that 
determine whether the instrument can express the player’s motor intent and support 
the development of manual skill. We propose that questions of digital
instrument design can be addressed by carefully considering the coupling between
a neural system, biomechanical system, and instrument, and even the environment
in which the musical performance involving the instrument takes place. Such questions
can be informed by thinking carefully about a neural system that “knows” how to
harness the mechanics of the body and object dynamics and a physical system that 
can “compute in hardware” in service of a solution to a motor problem. 

The human perceptual system is attuned not only to extracting structure from
signals (or even pairs of signals) but also to extracting structure from pairs of signals known
to be excitations and responses (inputs and outputs). What the perceptual system 
extracts in that case is what the psychologist J. J. Gibson refers to as “invariants” 
[19]. According to Gibson, our perceptual system is oriented not to the sensory field 
(which he terms the “ambient array”) but to the structure in the sensory field, the 
set of signals which are relevant in the pursuit of a specific goal. For example, in 
catching a ball, the “signal” of relevance is the size of the looming image on the 
retina and indeed the shape of that image; together these encode both the speed and 
angle of the approaching ball. Similarly, in controlling a drum roll, the signal of 
relevance is the rebound from the drumhead which must be sustained at a particular 
level to ensure an even roll. The important thing to note is that for the skilled player, 
there is no awareness of the proximal or bodily sensation of the signal. Instead, the 
external or “distal” object is taken to be the signal’s source. In classical control, such 
a structured signal is represented by its generator or a representation of a system 
known to generate such a structured signal. 

Consider for a moment, a musician who experiences a rapid oscillation-like behav- 
ior arising from the coupling of his or her own body and an instrument, perhaps the 
bounce of a bow on a string, or the availability of a rapid re-strike on a piano key due 
to the function of the repetition lever. Such an experience can generally be evoked 
again and again by the musician learning to harness such a behavior and develop 
it into a reliable technique, even if it is not quite reliable at first. The process of 
evoking the behavior, by timing one’s muscle actions, would almost certainly have 
something to do with driving the behavior, even while the behavior’s dynamics might 
involve rapid communication of energy between body and instrument as described 
above. Given that the behavior is invariant to the mechanical properties of body and 
instrument (insofar as those properties are constant), it seems quite plausible that
the musician would develop a kind of internal description or internal model of the 
dynamics of the behavior. That internal model will likely also include the possibilities 
for driving the behavior and the associated sensitivities. 

In his pioneering work on human motor control, Nikolai Bernstein has described
how the actions of a blacksmith are planned and executed in combination with knowl- 
edge of the dynamics of the hammer, workpiece, and anvil [20]. People who are 
highly skilled at wielding tools are able to decouple certain components of planned 
movements, thereby making available multiple “loops” or levels of control which 
they can “tighten” or “loosen” at will. In the drumming example cited above, we 
have seen that players can similarly control the impedance of their hand and arm to 
control the height of stick bounces (the speed of the drum roll), while independently 
controlling the overall movement amplitude (the loudness of the drum roll). 

Interestingly, the concept of an internal model has become very influential in 
the field of human motor behavior in recent years [21] and model-based control 
has become an important sub-discipline in control theory. There is therefore much 
potential for research concerned with exploring the utility of model-based control
for musical instruments, especially from the perspective that the model internalized 
by the musician is one that describes the mechanical interactions between his or 
her own body and the musical instrument. This chapter is but a first step in this 
direction. Before leaving the questions we have raised here, however, we will briefly 
turn our attention to how the musician might learn to manage such coupled dynamics, 
proposing that the robustness, immediacy, and potential for virtuosity associated with 
acoustic instrument performance is derived in large part from engaging interactions 
that involve both the active and passive elements of the sensorimotor system. 


2.5 Implications of a Coupled Dynamics Perspective 
on Learning to Play an Instrument 


At the outset of this chapter, we proposed that successful acoustic instruments are 
those which are well matched, in terms of their mechanical impedance, to the capa- 
bilities of our bodies. In other words, for an experienced musician, the amount of 
work they need to do to produce a desired sound is within a range that will not 
exhaust their muscles on the one hand but which will provide sufficient push-back 
to support control on the other. But what about the case for someone learning an 
instrument? What role does the dynamic behavior of the instrument play in the pro- 
cess of learning? Even if we do not play an instrument ourselves, we are probably all 
familiar with the torturous sound of someone learning to bow a violin, or with our 
own exhausting attempts to get a note out of a garden hose. This is what it sounds 
and feels like to struggle with the coupled dynamics of our bodies and an instrument 
whose dynamical behavior we have not yet mastered. And yet violins can be played, 
and hoses can produce notes, so the question is how does someone learn to master 
these behaviors? 

Musical instruments represent a very special class of objects. They are designed 
to be manipulated and to respond, through sound, to the finest nuances of movement. 
As examples of tools that require fine motor control, they are hard to beat. And, 
as with any tool requiring fine motor control, a musician must be sensitive to how 
the instrument responds to an alteration in applied action with the tiniest changes in 
sound and the tiniest changes in haptic feedback. Indeed, a large part of acquiring skill 
as a musician is being able to predict, for a given set of movements and responses, 
the sound that the instrument will make and to adjust movements, in anticipation or 
in real time, when these expectations are not met. 

The issue, as Bernstein points out, is that there are often many ways of achieving 
the same movement goal [20]. In terms of biomechanics, joints and muscles can 
be organized to achieve an infinite number of angles, velocities, and movement 
trajectories, while at the neurophysiological level, many motor neurons can synapse
onto a single muscle and, conversely, many muscle fibers can be controlled by one 
motor unit (see Sect. 3.2 for more details concerning the hand). This results in a 
biological system for movement coordination that is highly adaptive and that can
support us in responding flexibly to perturbations in the environment. In addition, 
as Bernstein’s observations of blacksmiths wielding hammers demonstrated, our 
ability to reconfigure our bodies in response to the demands of a task goal extends to 
incorporating the dynamics of the wielded tool into planned movement trajectories 
[20, 22]. Indeed, it is precisely this ability to adapt our movements in response to 
the dynamics of both the task and the task environment that allows us to acquire new
motor skills. 

Given this state of affairs, how do novice musicians (or indeed experienced musi- 
cians learning new pieces) select from all the possible ways of achieving the same 
musical outcome? According to Bernstein’s [20] theory of graded skill acquisition, 
early stages of skill acquisition are associated with “freezing” some biomechanical
degrees of freedom (DoF; e.g., joint angles). Conversely, later (higher) stages are characterized by a more
differentiated use of DoF (“freeing”), allowing more efficient and flexible/functional 
performance. This supposition aligns well with experimental results in which
participants adopted a high impedance during early stages of learning (perhaps removing
DoF from the coupled dynamics) and transitioned to a lower impedance once the
skill was mastered [23].

More recently, Ranganathan and Newell [24, 25] proposed that in understanding 
how and why learning could be transferred from one context to another, it was 
imperative to uncover the dynamics of the task being performed and to determine the 
“essential” and “non-essential” task variables. They define non-essential variables as 
the whole set of parameters available to the performer and suggest that modifications 
to these parameters lead to significant changes in task performance. For example, in 
throwing an object the initial angle and velocity would be considered non-essential 
variables, because changes to these values will lead to significant changes in the task 
outcome. The essential variables are a subset of the available working parameters 
that are bound together by a common function. In the case of throwing an object, 
this would be the function that relates the goal of this particular throwing task 
to the required throwing angle and velocity [26]. The challenge, as Pacheco and 
Newell point out, is that in many tasks this information is not immediately available. 
Therefore, the learner needs to engage in a process of discovery or “exploration” of 
the available dynamic behaviors to uncover, from the many possible motor solutions, 
which will be the most robust. But finding a motor solution is only the first step 
since learning will only occur when that movement pattern is stabilized through 
practice [27]. 
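
As an illustration only, the throwing example can be made concrete with the textbook drag-free projectile formula. That formula is an assumption introduced here and is not the task model used in [26]; the function names, the 5 m target, and the 2 degree angle slip are likewise hypothetical. The sketch lists several angle and speed pairs that reach the same target, then shows how far the landing point shifts when the release angle slips.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def landing_distance(speed, angle_deg):
    """Range of an ideal projectile released at ground level (no drag assumed)."""
    return speed ** 2 * math.sin(2 * math.radians(angle_deg)) / G


def speed_for_target(target, angle_deg):
    """Release speed that reaches `target` metres for a given release angle."""
    return math.sqrt(target * G / math.sin(2 * math.radians(angle_deg)))


def landing_error_for_angle_slip(target, angle_deg, slip_deg=2.0):
    """Landing error when the intended speed is kept but the angle slips."""
    speed = speed_for_target(target, angle_deg)
    return landing_distance(speed, angle_deg + slip_deg) - target


TARGET = 5.0  # metres, hypothetical task goal

for angle in (25, 35, 45, 55, 65):
    speed = speed_for_target(TARGET, angle)
    error = landing_error_for_angle_slip(TARGET, angle)
    print(f"{angle:2d} deg  {speed:5.2f} m/s  error for a 2 deg slip: {error:+.3f} m")
```

Under these assumptions every pair lands on the target, but the pairs differ in how strongly an angle error changes the outcome. That difference is one simple reading of what it means for one motor solution to be more robust than another.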

In contrast to exploration, stabilization is characterized as a process of making 
movement patterns repeatable, a process which Pacheco and Newell point out can 
be operationalized as a negative feedback loop, where both the non-essential and 
essential execution variables are corrected from trial to trial. Crucially, Pacheco and 
Newell determined that, for learning and transfer to be successful, the time spent in 
the exploration phase and the time spent in the stabilization phase must be roughly 
equal [26]. 
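
The idea of stabilization as a negative feedback loop acting from trial to trial can be sketched in a few lines. This is a minimal illustration, not the model of Pacheco and Newell: the single scalar execution variable, the gain, the noise level, and all names are assumptions made for the example.

```python
import random


def stabilize(goal, start, gain=0.5, noise_sd=0.05, trials=30, seed=0):
    """Correct one execution variable from trial to trial.

    After each trial a fraction `gain` of the observed error is subtracted
    from the next command, which is a negative feedback loop.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    command = start
    errors = []
    for _ in range(trials):
        outcome = command + rng.gauss(0.0, noise_sd)  # execution noise
        error = outcome - goal
        errors.append(error)
        command -= gain * error  # correction applied to the next trial
    return errors


errors = stabilize(goal=1.0, start=2.0)
late = sum(abs(e) for e in errors[-10:]) / 10
print(f"first-trial error: {errors[0]:+.3f}")
print(f"mean absolute error over the last 10 trials: {late:.3f}")
```

With a gain between 0 and 1 the initial offset shrinks geometrically, and the remaining trial-to-trial variation is set by the execution noise.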

As yet, we have little direct evidence of these phases of learning of motor skill in 
the context of playing acoustic musical instruments. A study by Rodger et al., how- 
ever, suggests that exploration and stabilization phases of learning may be present as 
new musical skills are acquired. In a longitudinal study, they recorded the ancillary (or 
non-functional) body movements of intermediate-level clarinetists before and after
learning a new piece of music. Their results demonstrated that the temporal control of 
ancillary body movements made by participants was stronger in performances after 
the music had been learned and was closer to the measures of temporal control found 
for an expert musician’s movements [28]. While these findings provide evidence 
that the temporal control of musicians’ ancillary body movements stabilizes with 
musical learning, the lack of an easy way to measure the forces exchanged across the 
mechanical coupling between player and instrument means that we cannot yet empir- 
ically demonstrate the role that learning to manage the exchange of energy across 
this contact might play in supporting the exploration and stabilization of movements 
as skill is acquired. Indeed, the fact that haptic feedback plays a role for the musician 
in modeling an instrument’s behavior has already been demonstrated experimentally 
using simulated strings [29] and membranes [11, 30]. In both cases, performance of 
simple playing tasks was shown to be more accurate when a virtual haptic playing 
interface was present that modeled the touch response of the instrument (see also 
Chap. 6). 

As a final point, we suggest that interacting with a digital musical instrument 
that has simulated dynamical behavior is very different from interacting with an 
instrument with a digitally mediated playing interface that only supports information 
exchange. As an extreme example, while playing keyboard music on a touch screen 
might result in a performance that retains note and timing information, it is very 
difficult, if not impossible, for a player to perform at speed or to do so without 
constantly visually monitoring the position of their hands. Not only does the touch 
screen lack the mechanical properties of a keyboard instrument, it also lacks the 
incidental tactile cues such as the edges of keys and the differentiated height of black 
and white keys that are physical “anchors” available as confirmatory cues for the 
player. 

In summary, a musical instrument interface that incorporates dynamic coupling 
not only provides instantaneous access to a second channel of information about its 
state, but, because of the availability of cues that allow for the exploration and selec- 
tion of multiple parameters available for control of its state, such an interface is also 
likely to support the development of higher levels of skill and musical expressiveness. 


2.6 Conclusions 


In this chapter, we have placed particular focus on the idea that the passive dynamics 
of the body of a musician play an integral role in the process of making music 
through an instrument. Our thesis, namely that performer-instrument interaction is, 
in practice, a dynamic coupling between a mechanical system and a biomechanical 
instrumentalist, repositions the challenge of playing an instrument as a challenge 
of “playing” the coupled dynamics in which the body is already involved. The idea 
that an instrument becomes an extension of the player’s body is quite concrete when 
the coupled dynamics of instrument and player are made explicit in a model. From 
a control engineering perspective, the body-/instrument-coupled dynamics form an
inner feedback loop; the dynamics of this inner loop are to be driven by an outer loop 
encompassing the player’s central nervous system. This new perspective becomes a 
call to arms for the design of digital musical instruments. It places a focus on the 
haptic feedback available from an instrument, the role of energy storage and return in 
the mechanical dynamics of the instrument interface, and the possibilities for control 
of fast dynamic processes otherwise precluded by the use of feedback with loop 
delay. 

This perspective also provides a new scaffold for thought on learning and skill 
acquisition, as we have only briefly explored. When approached from this perspec- 
tive, skill acquisition is about refining control of one’s own body, as extended by the 
musical instrument through dynamic coupling. Increasing skill becomes a question 
of refining control or generalizing previously acquired skills. Thus, soft-assembly 
of skill can contribute to the understanding of learning to play instruments that 
express musical ideas. The open question remains: what role does the player’s per- 
ception of the coupled dynamics play in the process of becoming a skilled performer? 
Answering this question will require us to step inside the coupled dynamics of the 
player/instrument system. With the advent of new methods for on-body sensing of 
fine motor actions and new methods for embedding sensors in smart materials, the 
capacity to perform such observations is now within reach. 


Chapter 3
A Brief Overview of the Human
Somatosensory System


Vincent Hayward 


Abstract This chapter provides an overview of the human somatosensory system. It 
is the system that subserves our sense of touch, which is so essential to our awareness 
of the world and of our own bodies. Without it, we could not hold and manipulate 
objects dextrously and securely, let alone musical instruments, and we would not 
have a body that belongs to us. Tactile sensations, conscious or unconscious, arise 
from the contact of our skin with objects. It follows that the mechanics of the skin 
and of the hand in its interaction with objects is the source of information that our
brain uses to dextrously manipulate objects, as in music playing. This information
is collected by a vast array of mechanoreceptors that are sensitive to the effects of
contacting objects, often with the fingers, even far away from the region of contact.
This information is processed by neural circuits in numerous regions of the brain to 
provide us with extraordinary cognitive and manipulative functions that depend so 
fundamentally on somatosensation. 


3.1 Introduction 


The overarching purpose of the somatosensory system is to inform the brain of the 
mechanical state of the body that it inhabits. It shares this function with the vestibular 
system. But whereas the vestibular system operates in the low-dimensional space of 
head translations and rotations, the somatosensory system takes its input from almost 
the entire body. The main sources of information arise in part from the load-bearing 
structures represented by connective tissues such as tendons and ligaments, in part 
from the motion-producing tissues, the muscles, and in part from the outer layers of
the body, that is, the skin. As a result, unlike the vestibular system, which is sensitive to
the movements of a rigid body—the cranium—the somatosensory system relates to 
mechanical domains that are in essence deformable bodies. This explains why, despite
the fact that the two systems share the same overall task, they differ fundamentally. 
The vestibular inputs arise from small, easily identifiable organs in the inner ears, 
since it is the low-dimensional description of the movements of a rigid body that 
is of interest. In contrast, the somatosensory system relates to what is essentially an
infinite dimensional solid (and liquid) domain and depends on the changes of its 
internal mechanical state to infer the properties of the objects that are being touched 
such as their weight, the substance they are made of, or the existence and nature of the 
relative movement of the body in relation to external objects [35, 74]. In other words, 
it is a distributed system in the physical sense that its mechanical state is described by 
(tensor) fields rather than vectorial quantities. This basic fact is of course reflected in 
its general organisation where very large populations of specific detectors are found in 
all load-bearing and load-producing tissues. That is not to say that the somatosensory 
system is unique in its reliance on large populations of sensors. This is also true of 
all sensory systems, including vision, audition, taste/olfaction and the vestibular
system. 

The haptic function depends on several systems of large organs. In an adult person, 
the skin’s mass can reach two kilograms and one of its functions is mechanosensing.
However, it must be kept in mind that most of the body’s soft and connective tissues 
are mechanosensitive and associated with abundant innervation. The exact contribu- 
tions of the different mechanoreceptive channels to the formation of haptic percepts 
remain today to be established. 

Recent research has revealed a number of rather surprising findings. For example,
most textbooks teach that the sense of a limb’s relative position is mediated by
mechanoreceptors embedded in the muscles. However, recent research has shown 
conclusively that the awareness of limb position is also mediated by sensory inputs 
arising from the skin [20, 21]. Similarly, it is often assumed that the perceived quality of
the surfaces of objects is the exclusive result of cutaneous inputs. Recently, it has been
shown that complete abolishment of distal cutaneous input, resulting from trauma or 
anaesthesia, had negligible effect on participants’ ability to discriminate the rough- 
ness of surfaces [53], which could be explained by the fact that friction-induced 
vibrations taking place at the fingertip propagate far inside the anatomy, at least up to 
the forearm [15], stimulating large populations of mechanoreceptors that might not 
be located in the skin and that can be quite remote from the locus of mechanical 
input [69]. 

These observations demonstrate that the study of the haptic function must be 
discussed from different perspectives where individual components should not be 
assigned one-to-one relationships, largely because the sensing organ, as alluded to 
in the previous paragraph, is by physical necessity distributed in the entire body and 
not even just at its surface. 


3.2 Biomechanics of the Hand


3.2.1 Hand Structural Organisation 


David Katz described the hand as a ‘unitary organ’ where the sensory and motor 
functions take place together [48]. The hand is not the only organ in the body that 
has this particularity. The foot is in many ways similar to the hand, but configured for 
locomotion rather than manipulation. Both organs possess an abundantly articulated 
skeletal structure held together by connective ligaments in the form of joint capsules 
and tendons that are connected to muscles located remotely in the forearm or the leg. 
In turn, these muscles insert in the arm and leg bones, and thus, a single tendon path 
can span up to four joints with the wrist and the three phalangeal joints. To give a 
sense of scale of the biomechanical complexity of the hand and the foot, it suffices to 
consider that phalanges receive four tendon insertions except for the distal phalanges 
that receive only two. Some tendons insert in several bones, and most tendons diverge 
and converge to form a mechanical network. The hand and the foot also have the so- 
called intrinsic muscles that insert directly into small bones, notably for the thumb, 
with some of these intrinsic muscles not inserting in any bones but in tendons only. 
Thus, if one considers bones, tendons and muscles to be individual elements, all 
connectivity options (one-to-one, one-to-several, several-to-one) are represented in 
the biomechanical structure of the hand, foot and limbs to which they are attached. 


3.2.2 Hand Mobility 


It is tempting to think of the hand as an articulated system of bodies connected 
with single-degree-of-freedom joints that guide their relative displacements. This 
simple picture is quite incorrect on two counts. The first is that skeletal joints are 
never ‘simple’ in the sense that they allow movements that ideal ‘lower pairs,’ such 
as simple hinges, would not. In biomechanics, one seldom ventures to quote a
precise number of degrees of freedom which, depending on the authors, can vary
from 10 to more than 60 when speaking of the hand only. The biomechanical reality
suggests that the kinematic mobility of the hand is simply six times the number of
bones considered, but the actual functional mobility suggests that certain joint
excursions have a much greater span than others. One could further argue that, save 
for nails, since the hand interacts with objects through soft tissues, its true mobility is 
infinite dimensional [35], a problem we shall return to when discussing the sensing 
capabilities of the hand. 

The most productive approach to make sense of this complexity is, counter- 
intuitively, to augment the complexity of the system analysed and to also include 
the sensorimotor neural control system in its description. In effect, the mechanics 
of the hand mean nothing without the considerable amount of neural tissue and 
attending sophisticated neural control that is associated with it. In this perspective, 
the concept of ‘synergies’ was put forward long ago by the pioneers of the study
of movement production and control (Joseph Babinski 1857–1932, Charles Scott
Sherrington 1857–1952, Nikolai Bernstein 1896–1966, and others) and has received
much study since. 

Loosely speaking, the idea behind this concept is that movements with a purpose— 
be it sensory, manipulative, locomotive or communicative—are highly organised. 
Each of these purposes is associated with the coordinated action of groups of mus- 
cles through time, but, importantly, the number of these purposes is small compared 
to the number of all possible movements. The purposes can include reaching, grasp- 
ing, feeling, drawing, stepping, pressing on keys, sliding on strings or plucking them, 
bending notes, and, crucially, they can be combined and chained together to yield 
complex behaviours orchestrated by the central nervous system. The entire senso- 
rimotor system, much of which is dedicated to the hand, is implemented following 
a hierarchical organisation with nuclei in the dorsal column, the brain stem, the 
midbrain, the cerebellum and ultimately several cortical regions. The considerable 
literature on the subject can be approached through recent books and surveys [10, 
51, 67]. 


3.2.3 The Volar Hand 


The inside region of the hand is named ‘volar’ by opposition to the ‘dorsal’ region. 
The volar region is of primary interest since it is the interface where most of the 
haptic interactions take place. Detecting a small object—say a sewing needle lying 
on a smooth surface—is absolutely immediate with the fingertip but more difficult 
with other volar hand regions, and the same object will go undetected by any other 
part of the body, including the dorsal hand region. It is also evident that the sensitive 
volar skin is mechanically very different from what is often called the ‘hairy skin’
covering the dorsal region. The most conspicuous feature is the presence of ridges, 
that is, of a clearly organised micro-geometry that is not seen elsewhere, except in 
the plantar region of the foot. In fact, this skin, often called ‘glabrous’ skin, differs from
the ‘hairy’ skin in four important properties. 


Pulp: The glabrous skin is never very close to nor very far from a bone. In the
fingertip and elsewhere in the hand, it is separated from the bone by a relatively 
uniform distance of 3 or 4 mm. The space in between is densely filled by a special 
type of connective tissue called the pulp [33]. This fibrous tissue is crucial to give 
the volar hand its manipulative and sensorial capabilities since a fingertip can take 
a load of several hundred newtons without damage and simultaneously detect
a needle. The pulp gives the skin the ability to conform to the touched object
by enlarging the contact surface, which is mainly independent from the load past 
a certain value [68]. Incidentally, this simple fact makes it evident that the notion 
of ‘force’ or even of ‘pressure’ must be taken carefully when speaking of tactile 
sensory performance (see Sect. 4.2). 
Ridges: The ridges are unique to the volar hand and plantar foot. They
have long been believed to serve the mechanical purpose of increasing friction and
indeed are often called ‘friction ridges’. Recent findings have shown that quite
the opposite is the case [80]. To understand why that is, one must consider basic
notions in contact mechanics evoked in the next paragraph. The main point is
that ridges actually diminish the net contact surface of the volar skin against an
object compared to a non-ridged surface.

Stratum Corneum: The external skin layer, the stratum corneum, is made of ker- 
atin, which is a structural material arising from the death of skin cells. This mate- 
rial is mechanically akin to a polymer [61] and is capable of creating complex 
mechanical effects during sliding, even on optically smooth surfaces [16, 19, 83].

Sweat Glands: While the volar regions of the body cover only 5% of its surface,
25% of all the 2 million sweat glands are located there, with a density reaching
300 per cm² [57, 73].


3.2.4 Bulk Mechanics of the Fingertip and the Skin 


The glabrous skin covering the volar region of the hand is, quite visibly, neither an
isotropic nor a homogeneous medium. It is apparent that the ridges introduce pre- 
ferred directions that facilitate certain types of deformations. The effect of static 
punch indentation on the human fingertip can be made visible by imaging the shape 
of finger contact with a flat surface when a small object, such as a guitar string, is 
trapped at the interface, see Fig. 3.1. 

The detailed local properties of the ridged skin were investigated in vivo by Wang 
and Hayward [79] by loading approximately 0.5 mm² regions of skin. Unsurprisingly,
the measurements revealed great anisotropy according to the ridge orientation when 
the skin is stimulated in traction, that is, in its natural mode of loading (see Fig. 3.2). 
On the other hand, the elastic properties of the ridged skin seem to be by and large
insensitive to factors such as the individual and the thickness of the stratum corneum. Detailed
in vivo measurement can also be performed using optical coherence tomography 
(OCT) or elastography [24, 52], obtaining results similar to those found by direct 
mechanical stimulation. These findings point out how uncertain it is to predict the 
properties of tissues across length and timescales. The viscoelastic properties of the 
ridged skin are dominated by two characteristic times, one very short, of the order 
of one millisecond, and the other much longer, of the order of several seconds [79], 
which shows, like the peripheral neural system introduced below, that the mechanical 
somatosensory system operates at several timescales. 

Also of relevance to the design of haptic interfaces is some knowledge of the bulk 
mechanical properties of the extremities, taken as a whole. Again, this subject is better 
tackled in terms of specific tasks. When the human finger interacts with a surface, 
three modes of interaction may be combined: (i) a contact can be made to or released 
from a surface; (ii) the finger can displace the mutual surface of contact through a 
rolling motion; (iii) or it can do so through a sliding motion [34, 35]. Each of these
modes corresponds to specific mechanics. When contact is made, the contact surface
grows very fast with normal loading, and normal displacement is accompanied by a
very steep rise in the contact force. To wit, a 1 mm indentation of the fingertip
by a flat surface corresponds to a normal load of less than 0.2 N, but at 2 mm the
normal load is already 10 times larger at 1.0 N, and it takes only an increment of
0.5 mm to reach the value of 5.0 N [68]; concomitantly, the contact area has reached
half of its ultimate value for only 0.5 N of load, and past 1.0 N, it will not increase
significantly, regardless of the load [68], suggesting that representing a fingertip by
a local convex elastic homogeneous solid is far from being an acceptable model
in terms of its ability to conform to the gross shape of touched objects. Moreover,
these properties are very much dependent on the speed at which indentation occurs.
Pawluk and Howe found that the mechanical response curve under similar conditions
varied greatly with speed: a 1.0 mm indentation applied at 0.2 mm/s causes a loading
of about 0.2 N, as just mentioned, but the same displacement applied at 80 mm/s
causes a contact loading of 1.0 N [63].

Fig. 3.1 a A punch indenting an ideal solid half-space follows the Boussinesq–Flamant
deformation problem, where the elongation follows the pattern indicated by the black line and the
shear deformation that of the grey line. b Imaging the contact surface indicates that an actual finger
grossly follows this pattern. However, a 2 mm indentation made by a 1 mm punch creates a
deformation region as large as 6 mm that does not have a circular shape, owing to the anisotropy of
the skin introduced by the ridges. Figure from [36]

Fig. 3.2 Equivalent material properties of human ridged skin along and across ridge direction
(solid lines) for eight different people. For most, the equivalent elasticity in elongation depends
strongly on the ridge direction, and different people can have very different skins. However, when
the deformation is dominated by shear, it is much less dependent on load orientation and on
individuals. Figure from [79]
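
The steepness of this load-indentation relationship can be made explicit by computing the secant stiffness between the values quoted above from [68]. The sketch below is only an arithmetic illustration. It treats the first load, stated as "less than 0.2 N", as exactly 0.2 N, which is an assumption that makes the first stiffness value a lower bound. The variable and function names are hypothetical.

```python
# (indentation in mm, normal load in N) as quoted in the text from [68].
# The first load is given only as "less than 0.2 N"; 0.2 N is assumed here.
points = [(1.0, 0.2), (2.0, 1.0), (2.5, 5.0)]


def incremental_stiffness(samples):
    """Secant stiffness (N/mm) between successive (indentation, load) samples."""
    return [
        (f1 - f0) / (d1 - d0)
        for (d0, f0), (d1, f1) in zip(samples, samples[1:])
    ]


stiffness = incremental_stiffness(points)
for (d0, _), (d1, _), k in zip(points, points[1:], stiffness):
    print(f"{d0:.1f} to {d1:.1f} mm: about {k:.1f} N/mm")
```

The stiffness over the last half millimetre is roughly ten times that over the preceding millimetre, which is the sense in which a single linear spring constant cannot describe fingertip contact.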

Most frequently, the finger interacts with a rigid object, which is oscillating
and/or provides the surface on which the finger slides, in all cases generating
oscillations in the finger pad. Such occurrences are common during music playing. To model
and explain these interactions, it is essential to have a model of the bulk mechanics of
the fingertip for small displacements and over the whole range of frequencies relevant
to touch, that is DC to about 1 kHz. In the low frequencies, the data can be extracted 
from studies performed in the condition of slow mechanical loading, transient load- 
ing or large displacements [29, 40, 62], but a recent study conducted with the aid of a 
novel mechanical impedance measurement technique [82] has shown that a fingertip, 
despite all the complexities of its local mechanics, may be considered as a critically 
damped mass-spring-damper system with a corner frequency of about 100 Hz and
where the contribution of inertia to the interaction force is negligible at all frequencies
compared with elasticity and viscosity [81], see Fig. 3.3. In essence, the fingertip is
dominantly elastic below 100 Hz and dominantly viscous above this frequency. In the high
frequencies (>400 Hz), the fingers exhibit structural dynamics that have an uncer- 
tain origin. Quite surprisingly, the fingertip bulk elasticity (of the order of 1 N/mm), 
viscosity (of the order of 1 N s/m) and equivalent inertia (of the order of 100 mg)
are by-and-large independent from a tenfold variation of the normal load. It can be 
surmised that these properties hold true for all volar regions of the hands and feet. 
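
A rough numerical reading of this lumped model is sketched below, under stated assumptions. The stiffness, equivalent inertia, and corner frequency are the order-of-magnitude values quoted above. The damping coefficient is not taken from the text: it is derived here as b = k / (2π f_c), the value at which the elastic and viscous terms are equal at the corner frequency. All names are hypothetical, and the numbers are indicative only.

```python
import math

K = 1000.0        # elasticity, N/m (text: of the order of 1 N/mm)
M = 1e-4          # equivalent inertia, kg (text: of the order of 100 mg)
F_CORNER = 100.0  # corner frequency, Hz (text: about 100 Hz)
# Assumption: damping implied by K and F_CORNER, in N s/m.
B = K / (2 * math.pi * F_CORNER)


def force_terms(freq_hz):
    """Elastic, viscous and inertial force magnitudes per metre of displacement."""
    w = 2 * math.pi * freq_hz
    return K, w * B, w * w * M


def dominant(freq_hz):
    """Name of the larger of the elastic and viscous terms at this frequency."""
    elastic, viscous, _ = force_terms(freq_hz)
    return "elastic" if elastic > viscous else "viscous"


for f in (10, 50, 200, 400):
    e, v, i = force_terms(f)
    print(f"{f:4d} Hz  elastic {e:7.1f}  viscous {v:7.1f}  inertial {i:7.1f}  {dominant(f)}")
```

With these values the elastic term dominates below the corner frequency and the viscous term above it, while the inertial term remains the smallest of the three over the frequencies shown.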

Friction is arguably the most important aspect of the haptic function since without 
it we could scarcely feel and manipulate objects. Because the finger is a biological, 
living object, it has properties which often escape our intuition, especially concerning 
its frictional properties, the latter having a major impact on the manipulative motor
function as well as on its detection and discriminative function [1]. All the afore- 
mentioned mechanosensitive sensors in the skin and deep tissues are in fact likely to 
respond to friction-induced phenomena. A good example of that is any attenuation of 
the sensitivity of these receptors, for example by a situation as banal as cold hand or 
dry hands, invariably results in an increase in the grip force as a strategic response of 
the brain to sensory deficit. This was also documented when fingers are dry since dry 
skin is more slippery [2]. As another example, recent studies in hedonic touch have 
established a link between the sensation of pleasantness and the skin’s tribological 
properties that in turn influence the physics of contact [47]. 

Some key points should be kept in mind. First, the notion of coefficient of friction in
biotribology must be complemented by the notion of load index, which describes the 
dependency between net normal load and the net traction, since in most cases of prac- 
tical importance Amontons’ first law, stating that friction is empirically independent 
from the apparent contact area, does not hold. A second point is the importance of 
Fig. 3.3 Fingerpad impedance for small displacements. Figure from [81]


the presence of water in the physics of the contact owing to the fact that keratin is the 
building material of the stratum corneum. Keratin is akin to hydrophobic polymers 
with the effect that traction increases with the presence of water despite the reduction 
of the interfacial shear strength. This is true up to a point where, in fact, excess of 
water hydrodynamically decreases friction in competition with the former effect. A 
third complicating factor is that the presence of water plasticises the stratum corneum 
with the consequence of dramatically increasing the effective contact area, which is a 
phenomenon that occurs at the molecular level [19]. A fourth factor is the very large 
effect of time on the frictional dynamics. In fact, all four of these factors dominate the
generation of traction as opposed to the normal gripping load, in direct opposition to
the simplistic friction models adopted in the great majority of neuroscience and
robotic studies [1]. Furthermore, this physics depends completely on the counter surface
interacting with the fingers, where the material properties, the roughness of the
surface and its structural nature (say wood) interact with the physiology of sudation
(perspiration) through an autonomic function performed by the brain [2].
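
To make the first of these points concrete, the following minimal sketch assumes a power-law relation between traction F and normal load W, namely F = k W^n, where the exponent n plays the role of the load index. The functional form, the coefficient k, the exponent value and the function names are assumptions chosen for illustration; they are not taken from the studies cited above.

```python
def traction(normal_load_n, k=1.0, n=0.7):
    """Power-law traction F = k * W**n; k and n are hypothetical values."""
    return k * normal_load_n ** n

def apparent_coefficient(normal_load_n, k=1.0, n=0.7):
    """Ratio F / W, which is constant only when the load index n equals 1."""
    return traction(normal_load_n, k, n) / normal_load_n

for load in (0.5, 1.0, 2.0):
    print(load, round(apparent_coefficient(load), 3))
```

With a load index below one, the apparent coefficient of friction decreases as the grip tightens, which is why a single coefficient is not enough to describe the contact.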


3.3 Sensory Organs 


3.3.1 Muscles, Tendons and Joints 


Muscles are primarily elastic systems that develop a tensional force that depends on
several factors, among which are their activation level and their mechanical state,
often simplified to just a length. At rest, a muscle behaves passively, like a nonlinear
spring that becomes stiffer at the end of its range. When activation is increased from
rest to full activation, the active contribution to the passive behaviour is greatest at
midrange. As a result, for a given activation level, a muscle loses tonus if it is too
short or too long. A muscle that shortens at high speed produces very little tension, 
while a lengthening muscle gives a greater tension, like a one-way damper. It must 
be noted that the neuromuscular system takes several hundreds of milliseconds to 
modulate the activation. Therefore, beyond a few Hertz, the passive portion of the 
dynamics dominates. Skeletal muscles are in great majority organised in agonist-antagonist
systems [84]. These terms describe the fact that separate muscles or muscle
groups accelerate or prevent movement by contracting and relaxing in alternation. It
is nevertheless a normal occurrence that muscle groups are activated simultaneously,
a behaviour termed co-contraction or co-activation. Co-contraction, which results in
a set of muscle tensions reaching a quasi-equilibrium around one or more joints, 
enables new functions, such as stabilisation of unstable tasks [8]. The behaviour of 
an articulation operating purely in an agonist or antagonist mode is nevertheless very 
different from that of the same articulation undergoing co-contraction. 

A consequence of co-contraction which is relevant to our subject is to stiffen the 
entire biomechanical system. This can be made evident when grasping an object. 
Take for instance a ruler between the thumb and the index finger, grip it loosely 
and note the frequency of the pendulum oscillation. Tightening the grip results in 
a net increase of this frequency as a consequence of the stiffening of all the tissues 
involved, including the muscles that are co-contracting: a tighter grip better resists
a perturbation. This also means that the musculoskeletal system can modulate
stiffness at a fixed position, for instance when grasping. This observation requires that
any linear model of the musculoskeletal system be considered with much circumspection.

We can now see how this system can contribute to the sensation of the weight 
of objects, since one of the strategies employed by people in the performance of
this perceptual task is to aim at reaching a static equilibrium where velocity tends 
towards zero, a condition that must be detected by the central nervous system. For 
instance, when it comes to heaviness, it has been noticed many times that subjects 
tend also to adopt a second strategy where rapid oscillations are performed around a 
point of equilibrium. In the latter case, it is possible to suppose that it is the variation 
of effort as a function of movement and of its derivative that provides information 
about the mass (and not about the weight). Muscles are connected to the skeleton by 
tendons which also have mechanoreceptors called the Golgi organs. These respond 
to the stress to which they are subjected and report it to the central nervous system, 
which is thus informed of the effort applied by the muscles needed to reach a static 
or dynamic equilibrium. 
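
The difference between the two strategies can be illustrated with a minimal sketch. It assumes an idealised point mass moved vertically by the hand, so that the required effort is m(g + a). The oscillation frequency, amplitude, sampling and function names are hypothetical choices made for illustration. At static equilibrium the effort reflects the weight mg, whereas during oscillation the slope of effort against acceleration reflects the mass m alone.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def hand_force(mass_kg, accel):
    """Vertical force needed to hold a mass undergoing acceleration accel."""
    return mass_kg * (G + accel)

def estimate_mass(accels, forces):
    """Least-squares slope of force against acceleration."""
    n = len(accels)
    mean_a = sum(accels) / n
    mean_f = sum(forces) / n
    num = sum((a - mean_a) * (f - mean_f) for a, f in zip(accels, forces))
    den = sum((a - mean_a) ** 2 for a in accels)
    return num / den

# Small oscillation at 3 Hz with 1 cm amplitude (illustrative values).
w = 2 * math.pi * 3.0
times = [k / 200.0 for k in range(200)]
accels = [-0.01 * w * w * math.sin(w * t) for t in times]
forces = [hand_force(0.5, a) for a in accels]

static_effort = hand_force(0.5, 0.0)    # proportional to the weight
mass = estimate_mass(accels, forces)    # independent of G
```

The slope estimate does not involve G at all, which is consistent with the remark that the oscillatory strategy informs about mass rather than weight.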

The joints themselves include mechanoreceptors. They are located in the joint 
capsule, which is a type of sleeve made of a dense network of connective tissues 
wrapping around a joint and containing the synovial fluid. These receptors, the so-called
Ruffini corpuscles, respond to the deformation of the capsule and appear to
play a key role when the joint approaches the end of its useful range of movement, 
in which case some fibres of the capsule begin stretching [28]. 

The sensory organs of the musculoskeletal system give us the opportunity to introduce
a great categorisation within the fauna of mechanoreceptors, namely rapidly
adapting (RA) and slowly adapting (SA) receptors. The distinction is made on a
simple basis. When an RA receptor is stimulated by undergoing a deformation, it
responds with a volley of action potentials whose duration and density are driven
directly by the rate of change of the stimulus, just like a high-pass filter would (but
direct analogies with linear filters should be avoided). When an SA-type receptor is
deformed, it responds for the whole duration of the stimulus but is rather insensitive 
to the transient portion and in that resembles a low-pass filter including the zero 
frequency component. 
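
The contrast between the two response types can be caricatured with a ramp-and-hold indentation. The sketch below is deliberately crude and, in line with the caveat about linear filters, is not a receptor model: the RA drive is simply taken as the magnitude of the change between successive samples, and the SA drive as the sustained level. The stimulus shape and all names are assumptions made for illustration.

```python
def ramp_and_hold(n_ramp=5, n_hold=10, level=1.0):
    """Indentation that rises linearly for n_ramp steps, then stays constant."""
    ramp = [level * (i + 1) / n_ramp for i in range(n_ramp)]
    return [0.0] + ramp + [level] * n_hold

def ra_drive(stimulus):
    """Caricature of a rapidly adapting unit: driven by the rate of change."""
    return [0.0] + [abs(b - a) for a, b in zip(stimulus, stimulus[1:])]

def sa_drive(stimulus):
    """Caricature of a slowly adapting unit: driven by the sustained level."""
    return list(stimulus)

stim = ramp_and_hold()
ra, sa = ra_drive(stim), sa_drive(stim)
print("RA during ramp:", ra[1], " RA during hold:", ra[-1])
print("SA during hold:", sa[-1])
```

In this caricature the RA drive is non-zero only while the indentation changes, whereas the SA drive persists for as long as the indentation is held.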

This distinction is universal and is as valid for the receptors embedded in ligaments 
and capsules (SA) as for those located in muscles and in the skin (SA and RA). To 
pursue the analysis of the perception of object properties, such as shape, we can 
realise that the joints too are involved in this task, since any muscular output and any 
resulting skeletal movement have an effect on the joints in the form of extra loading, 
relative sliding of structures and connective tissue deformation. This observation
illustrates the conceptual difficulties associated with the study of the haptic system,
namely that it is practically impossible to associate a single stimulus to an anatomical 
classification of the sources of information. 


3.3.2 Glabrous, Hairy and Mucosal Skin 


The body surface is covered with skin. As mentioned above, it is crucial to distinguish 
three main types of skin having very different attributes and functions. The mucosal 
skin covers the ‘internal’ surfaces of the body and is in general humid. The gums
and the tongue are capable of vitally important sensorimotor functions [7, 39, 75]. 
The tongue’s capabilities are astonishing: it can detect a large number of objects’ 
attributes including their size, their shape, very small curvature radii, hardness and 
others. Briefly, one may speculate that the sensorimotor abilities of the tongue are 
sufficient to instantly detect any object likely to cause mechanical injury in case of 
ingestion (grains of sand, fish bones). 

The glabrous skin has a rather thick superficial layer made of keratin (like hairs) 
which is not innervated. The epidermis, right under it, is living and has a special 
geometry such that the papillae of the epidermal-dermal junction are twice as frequent
as the print ridges. The folds of the papillae house receptors called Meissner
corpuscles, which are roughly as frequent in the direction transversal to the ridges as 
in the longitudinal direction. The Merkel complexes (which comprise a large number 
of projecting arborescent neurites) terminate on the apex of the papillae matching 
the corresponding ridge, called the papillary peg. The hairy skin does not have such 
a deeply sculptured organisation. In addition, each hair is associated with muscular 
and sensory fibres that innervate an organ called the hair follicle.

This geometry can be better appreciated if considered at several length scales and 
under different angles. A fingerprint shows that the effective contact area is much 
smaller than the touched surface. The distribution of receptors is closely related to
the geometry of the fingerprint. In particular, the spatial frequency of the Meissner 
corpuscles is twice that of the ridges. On the other hand, the spatial frequency of the 
arborescent terminations of the Merkel complexes is the same as that of the ridges. 
This geometry explains why the density of Meissner corpuscles is roughly five times 
greater than that of the Merkel complexes [37, 45, 55, 59]. Merkel complexes, 
however, come in two types. The other type forms long chains that run on the apex of 
the papillae [60]. The distinctive tree-like structure of this organ terminates precisely 
at the dermal-epidermal interface.

It is useful to perform simple experiments to realise the differences in sensory 
capabilities between glabrous and hairy skin. It suffices to get hold of rough surfaces, 
such as a painted wall or even sand paper, and to compare the experience when 
touching it with the fingertip or with the back of the hand. Try also to get hold
of a Braille text and to read it with the wrist. The types of receptors seem
to be similar in both kinds of skin, but their distribution and the organisation and 
biomechanical properties of the respective skins vary enormously. One can guess 
that the receptor densities are greatest in the fingertips. There, we can have an idea of 
their density when considering that the distance between the ridges of the glabrous 
skin is 0.3-0.5 mm. 

The largest receptor is the Pacini corpuscle. It is found in the deeper regions of the 
subcutaneous tissues (several mm) but also near the skin, and its density is moderate, 
approximately 300 in the whole hand [11, 71]. It is large enough to be seen with
the naked eye, and its distribution seems to be opportunistic and correlated with the 
presence of main nervous trunks rather than functional skin surfaces [32]. Receptors 
of this type have been found in a great variety of tissues, including the mesentery, but 
near the skin they seem to have a very specific role, that of vibration detection. The 
Pacinian corpuscle allows us to introduce a key notion in physiology, that of specificity
or ‘tuning’. It is a common occurrence in all sensory receptors (be they chemoreceptors,
photoreceptor cells, thermoreceptors or mechanoreceptors) that they are tuned
to respond to certain classes of stimuli. The Pacinian corpuscle does not escape this 
rule since it is specific to vibrations, maximising its sensitivity for a stimulation 
frequency of about 250 Hz but continuing with decreasing sensitivity to 1000 Hz. It 
is so sensitive that, under passive touch conditions, it can detect vibrations of 0.1 
micrometer present at the skin surface [78]. Even higher sensitivity was measured 
for active touch: results addressing a finger-pressing task are reported in Sect. 4.2. 

The Meissner corpuscle, being found in great numbers in the glabrous skin, plays 
a fundamental role in touch. In the glabrous skin, it is tucked inside the ‘dermal papil- 
lae’, and thus in the superficial regions of the dermis, but nevertheless mechanically 
connected to the epidermis via a dense network of connective fibres. Therefore, it is 
the most intimate witness of the most minute skin deformations [72]. One may have 
some insight into its size by considering that its ‘territory’ is often bounded by sweat 
pores [55, 60]. 

Merkel complexes, in turn, rather than being sensitive axons tightly packed inside 
a capsule, have tree-like ramifications that terminate near discoidal cells, the so-called
Merkel cells. In the hairy skin, these structures are associated with each hair. They
are also very present in mucosal membranes. In the glabrous skin, they have up to
50 terminations for a single main axon [30]. The physiology of Merkel cells is
not well understood [54]. They are thought to participate in mechanotransduction together
with the afferent terminals to provide these with a unique firing pattern. In any case,
Merkel complexes are associated with slowly adapting responses, but their functional
significance is still obscure since some studies show that they can provide a Pacinian- 
type synchronised response up to 1500 Hz [27]. 

The Ruffini corpuscle, which we already encountered while commenting on joint 
capsules, has the propensity to associate itself with connective tissues. Recently, it has 
been suggested that its role in skin-mediated touch is minor, if not nonexistent, since
glabrous skin seems to contain very few of them [58]. This finding was indirectly 
supported by a recent study implicating the Ruffini corpuscle not in mechanical 
stimulation due to direct contact with the skin, but rather in the connective tissues 
around the nail [5]. Generally speaking, the Ruffini corpuscle is very hard to identify 
and direct observations are rare, even in glabrous skin [12, 31]. 

Finally the so-called C fibres, without any apparent structure, innervate not only 
the skin, but also all the organs in the body and are associated with pain, irritation and 
also tickling. These non-myelinated, slow fibres (about 1 m/s) are also implicated in 
conscious and unconscious touch [76]. It is however doubtful that the information 
that they provide participates in the conscious perception of objects and surfaces 
(shape, size, or weight for instance). These properties invite the conclusion that the
information of the slow fibres participates in affective touch and in the development
of conscious self-awareness [56]. 

From this brief description of the peripheral equipment, we can now consider the 
receptors that are likely to play a role in the perception of external mechanical
loading. As far as the Ruffini corpuscles are concerned, several studies have
shown that the joints, and hence the receptors located there, provide proprioceptive
information, that is, an estimation of the mechanical state of the body (relative limb
position, speed, loading). It is also possible that they are implicated in the perception 
of the deformation of deep tissues which occurs when manipulating a heavy object. 
It might be surprising, but the central nervous system becomes aware of limb move- 
ments not only by the musculoskeletal system and the joints, but also by the skin and 
subcutaneous tissues [22]. 

It is clear that the receptors that innervate the muscles also have a contribution
to make, since at the very least the nervous system must either control velocity 
to zero, or else estimate it during oscillatory movements. Muscles must transmit 
an effort able to oppose the effects of both gravity and acceleration in the inertial 
frame. Certainly, Golgi organs—which are located precisely on the load path—would 
provide information, but only if the load to be gauged is significantly larger than that 
of the moving limb. Lastly, the gauged object in contact with the hand would deform 
the skin. From this deformation, hundreds of mechanoreceptors would discharge, 
some transitorily when contact is made, some in a persisting fashion. 

At this point, it should be clear that the experience of the properties of an object, 
such as its lack of mobility, is really a ‘perceptual outcome’ arising from complex
processing in the nervous system and relying on many different cues, none of which
alone would be sufficient to provide a direct and complete measurement of any
particular property. This phenomenon is all the more remarkable since, say, a saxophone
seems to have the same weight when it is held with the arms stretched out,
squeezed between two hands, held by the handle with a dangling arm, or held in two
arms (among other possibilities), each of these configurations involving distinct
muscle groups and providing the nervous system with completely different sets of 
cues! 


3.3.3 Electrophysiological Response 


3.3.3.1 Categories of Responses 


The idea behind the study of the electrophysiological response is to measure directly 
the signals transmitted by the neurons, the so-called action potentials. This measure- 
ment can be done by inserting electrodes in peripheral nerves, something that can 
be done in people without measurable consequences for health. It was when making
such measurements that it was realised that there existed the two types of responses
already mentioned (SA and FA, where FA stands for fast adapting and is synonymous
with RA). It is nevertheless important to distinguish the capacity
of a given receptor to respond to fast stimuli from its type of response.

For the receptors located in the skeletomuscular system, it is relatively easy to 
determine their response mode from the anatomy, but in the skin this is not possible. 
Mechanoreceptors, with the exception of the Pacinian corpuscle, are very small and 
very dense, and recording is only possible at some distance (wrist, arm, leg). The 
consensus is that the Ruffini corpuscles (not observed in the glabrous skin) are of the 
SA type and so are the Merkel complexes. On the other hand, the Meissner corpuscle 
is of the FA type. 

Some of these inferences are made by stimulating the skin with von Frey filaments, 
from Max von Frey who introduced them at the end of the nineteenth century as a 
calibrated method to stimulate touch. Using this method, it is possible to determine 
that certain afferent nerve fibres respond from stimulating a tightly limited territory, 
say of a size of 2 mm (type I), while some others respond to stimulation applied 
within a much wider territory, up to one centimetre in size, or more (type II). This 
physiological distinction—yet not anatomical—gives rise to four possibilities: FA-I, 
FA-II, SA-I, SA-II. The receptive fields are very varied in shape and sizes through- 
out the surface of the body, frequently overlapping, and often, they do have clear 
borders [42, 43, 46, 77]. 

Most mechanical phenomena at play, however, are nonlocal; detecting a one mm 
crumb with the finger has mechanical consequences that spread up to 100 mm? of 
skin tissue; sliding the finger on a surface with 10 µm asperities has easily measurable
consequences up the forearm [15, 69]. In that sense, it is highly probable that
most motor and perceptual behaviours simultaneously engage all mechanoreceptor
populations [66].


3.3.3.2 Coding Options


It stands to reason that the flow of the action potentials must be able to encode infor- 
mation arising from peripheral stimulation. Before proceeding further, it is impor- 
tant to recall that information ascending from the periphery is not the only source 
that determines the conscious experience, far from it. In fact, self-generated move- 
ment [13], intention [85], and learning [17], not counting stimuli coming from other 
sensory modalities [18, 34], all modify the conscious percept arising from the same
stimulation. 

A number of codes have been discovered that represent information arising from 
touch and kinaesthesia neurally. It is likely that many more will be discovered in the 
future. As far as kinaesthetic information is concerned, it was found that the specific 
recruitment of nerve fibres encodes spatially the position of a joint [9]. With regard to 
the direction of movement, it seems plain that the agonist-antagonist organisation of
the motor system encodes it automatically. The muscle spindles respond specifically 
to velocity by a frequency code: the larger the change of length per
unit of time (that is, speed), the higher the number of nerve impulses (or action
potentials) per unit of time. This code has the property of being resistant to noise and
perturbations: an action potential missed or fired accidentally does not make a great 
difference over a long period of time. On the downside, this code is by construction 
not temporally precise because it takes a minimum number of action potentials to 
encode a rate. 
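
The trade-off between robustness and temporal precision can be illustrated with a minimal sketch. It assumes an idealised, perfectly regular spike train and a decoder that simply counts spikes in a time window; real afferent discharges are irregular, and the rate, window lengths and function names are hypothetical choices made for illustration.

```python
def rate_estimate(spike_times, t_start, t_end):
    """Estimate firing rate (spikes/s) by counting spikes in [t_start, t_end)."""
    count = sum(1 for t in spike_times if t_start <= t < t_end)
    return count / (t_end - t_start)

def regular_train(rate_hz, duration_s):
    """Idealised, perfectly regular spike train (an assumption for clarity)."""
    n = int(round(rate_hz * duration_s))
    return [k / rate_hz for k in range(n)]

train = regular_train(100.0, 1.0)
corrupted = train[:50] + train[51:]   # one action potential missed

long_clean = rate_estimate(train, 0.0, 1.0)        # 100 spikes/s
long_noisy = rate_estimate(corrupted, 0.0, 1.0)    # 99 spikes/s
short_clean = rate_estimate(train, 0.5, 0.52)      # two spikes in 20 ms
short_noisy = rate_estimate(corrupted, 0.5, 0.52)  # one spike in 20 ms
```

Over the one-second window the missed action potential changes the estimate by one per cent, whereas over the 20 ms window it roughly halves it: a rate can only be read reliably over a window that contains many action potentials.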

As far as touch is concerned, codes are still mysterious but a few have been 
found. For low intensity stimulation, certain FA receptors behave like oscillators 
synchronised with the waveform [65], which corresponds to a temporal code. In 
touch, it is also clear that spatial coding is fundamental. For instance, when reading 
Braille each dot specifically stimulates a small population of receptors which convey 
the presence of the dot [26]. The shape of a touched object can be directly coded 
by the contact surface [49]. Other codes, however, are likely to be at play. When 
a fingertip is mechanically loaded ramping from rest to a maximal value in the 
tangential direction—an event that occurs each time we pick up an object—it was 
shown that this event is represented by a correlation code [41]. This means that it is
the temporal coincidence of two or more action potentials that convey the nature 
of the mechanical interaction between the finger and the object. It has also been 
shown that when a finger slips on a surface with a single asperity, action potentials 
are synchronised with the encounter of this asperity with each ridge of the print, which
corresponds to an extremely fine spatiotemporal code [50]. 

During gripping, the recruitment code has also been documented as coding directly 
in skin coordinates [26]. A similar observation can also be made of curvature, since 
the ratio between the contact surface and the normal load depends on it [25]. It is 
highly probable that sliding and sticking and transitions between these two states 
are coded by the relative response of RA and SA populations, which is another 
form of correlation [70]. Another important attribute of a contact detected by touch 
is simply the average load (namely its direction and magnitude in the normal and
tangential directions [4]), which leads one to believe that generally information is coded
by receptor populations and not by individual ones. It is also probable that the elastic
properties of the touched object are coded peripherally and specifically by composite 
populations in space and time. Last but not least, the coding of texture, or rather of 
the micro-geometry of surfaces that interact with the glabrous skin, was the subject 
of a considerable number of studies [38]. Despite these works, it is likely that most 
of the codes employed by primates remain to be discovered. 

The question of codes can also be considered from the viewpoint of the physio- 
logical response of receptors. Unfortunately, this approach is fraught with numerous 
difficulties. It is very rare that one can stimulate one particular receptor specifically
and measure its response. Since stimulation can only be effected from the surface
of the skin, even the most concentrated indentations have consequences far away 
from the contact site: deformation propagates several millimetres around the zone of 
stimulation [14]. As a result, it is generally impossible to associate a physiological 
response to a particular anatomical characteristic. 

Due to its size, the Pacinian corpuscle is nevertheless an exception because it 
is possible to study its response in vitro [3, 6]. It has interesting characteristics 
some of which are shared with Merkel complexes [27]. The first peculiarity is a 
frequency-dependent sensitivity: the deformation needed to trigger a single action 
potential is smallest at 250 Hz. In this condition, the discharge of action potentials 
is synchronous with the stimulation, giving a direct temporal code. If amplitude 
is reduced, the corpuscle loses this synchronicity property but still responds over
several cycles to truly microscopic deformations. This feature translates into a transfer
function with a strong, obvious nonlinear jumping behaviour. For a given frequency,
the response does not change with amplitude over a range, but once a threshold is 
reached, a frequency doubling is observed. 

Taking the example of the perception of the weight of an instrument, it should 
become increasingly clear that such perception does not result from a single or simple 
family of neural signals, but from a veritable jungle of motor and sensorial signals 
whose conscious perception is that of a unitary percept attributed to the held object. 
This could help to explain why the motor system and the perceptual system seem
to operate independently from each other, at least when it comes to the conscious 
knowledge of either action or perception [23, 64]. 


3.4 Central Organs 


It is not easy to paint a concise and logical picture of the central nervous organisation 
of the haptic system. Besides, it would be misleading to believe that it can be confined 
to a small number of functionally and anatomically well-delimited cortical areas, 
ganglia and pathways. The discovery of this organisation is a work in progress.
It was originally uncovered through the random consequences of war, accidents, diseases
and surgical innovations, and is studied today with electrophysiology (in humans, but mostly in
monkeys and rats) and brain imaging techniques (PET, fMRI, and very recently optical
imaging). It can be said that the representation that is made of this organisation
constantly changes with the introduction of new techniques.

Nevertheless, it is useful to have a general idea of the great structures [44]. Sensory 
pathways ascend through the spine and first project on dorsal column nuclei which 
in turn project onto the ventral posterior nucleus of the thalamus, located at the apex 
of the spine, right at the centre of the cranium. Many functions are ascribed to the 
thalamus, but one of them is to transmit all sensory afferent information (with the 
exception of olfaction and vestibular inputs) to the cortical regions. This organ seems 
to be able to process peripheral information into a form that is suitable for cortical 
processing. 

The somatosensory cortex is located on both sides of the great parietal circumvolution,
and a huge number of fibres project onto it. The cortex is divided into two
main areas, SI (primary) and SII (secondary), on each side of the central parietal
sulcus. According to Brodmann’s nomenclature [86], SI is divided into four areas: 1,
2, 3a and 3b, based on their neuronal architectures. Thalamic fibres terminate for
the most part in 3a and 3b which are, in turn, connected to areas 1 and 2, portraying
a hierarchical organisation where, like in the other sensory modalities, increasingly
abstract representations are successively formed. It is believed, for instance, that area
1 is implicated in the representation of textures, that area 2 encodes size and shape, 
and that areas 3a and 3b are dedicated to lower-level processing. It has been discov- 
ered that two other areas of the parietal posterior region, 5 and 7, are also involved in 
haptic processing. In any case, the somatotopic organisation progressively reduces 
with the distance from peripheral inputs. 


3.5 Conclusions 


The somatosensory system is distributed throughout the entire body with mechanical, 
anatomical and physiological attributes that vary greatly with the regions considered. 
These variations can be explained by the mechanical function of each organ: the 
fingertip is very different from, say, the elbow, the lips or the tongue. It is therefore 
tempting to relate these attributes to common motor functions, such as gripping, 
throwing objects, eating or playing musical instruments. 


in Musical Performance

Abstract We suggest that studies on active touch psychophysics are needed to
inform the design of haptic musical interfaces and better understand the relevance 
of haptic cues in musical performance. Following a review of the previous litera- 
ture on vibrotactile perception in musical performance, two recent experiments are 
reported. The first experiment investigated how active finger-pressing forces affect 
vibration perception, finding significant effects of vibration type and force level on 
perceptual thresholds. Moreover, the measured thresholds were considerably lower 
than those reported in the literature, possibly due to the concurrent effect of large 
(unconstrained) finger contact areas, active pressing forces, and long-duration stim- 
uli. The second experiment assessed the validity of these findings in a real musical 
context by studying the detection of vibrotactile cues at the keyboard of a grand 
and an upright piano. In fact, sensitivity to key vibrations was not only highest at
the lower octaves, gradually decreasing toward higher pitches; it was also significant
for stimuli having spectral peaks of acceleration similar to those of the first
experiment, i.e., below the standard sensitivity thresholds measured for sinusoidal 
vibrations under passive touch conditions. 


4.1 Introduction


As we have seen in Chap. 3, the somatosensory system relies on input from
receptors that operate within deformable human tissues. One solution for measuring 
their activity precisely is to keep those tissues free from any kinematic perturba- 
tion. Such experiments—in which subjects were typically stimulated with vibra- 
tions at selected areas of their skin while remaining still—have set the roots of the 
psychophysics of passive touch. However, as Gibson observed in 1962, “passive 
touch involves only the excitation of receptors in the skin and its underlying tissue,” 
while “active touch involves the concomitant excitation of receptors in the joints 
and tendons along with new and changing patterns in the skin” [24]. This observa- 
tion suggests that the psychophysics of active touch may exhibit relevant differences 
from the passive case. Furthermore, a systematic investigation of active touch
psychophysics presents additional practical difficulties in experimental settings due
to interactivity, which may explain the current scarcity of results in the field. Even if
we assume a small and well-defined vibrating contact at the fingertip, any change in 
this contact—as typically found in finger actions such as sliding or pressing—gives 
rise to new normal and longitudinal forces acting on the skin and to different contact 
areas. Such side-effects are indeed known to alter the tactile percept [9, 10, 28, 34, 
36, 54]. The surrounding skin regions, which contribute to tactile sensations, are also 
dynamically affected by such changes and by the patterns of vibrations propagating 
across them [49]. 

The perception of vibrations generated by musical instruments during playing
is no exception to the above mechanisms. In fact, the respective experimental
scenario is conceptually even more complicated and technically challenging.
While in general tactile stimuli may be controlled reasonably well in active touch 
psychophysics experiments, when considering instrumental performance one has to 
take into account that vibrations are elicited by the subjects themselves while playing 
and that concurrent auditory feedback may affect tactile perception [30, 46, 50, 59]. 

As explained in Chap. 2, a tight closed loop is established between musicians and 
their instruments during performance. Experimentation on active touch in the context 
of musical performance hypothesizes that tactile feedback affects such interaction 
in a number of ways and eventually has a role in the production of musical sounds. 


4.1.1 Open-Loop Experimentation 


The study of haptic properties of musical instruments outside of the
musician–instrument interaction (i.e., in open loop) conceptually simplifies the experimental
design, while effectively preparing the ground for further studies in closed loop. 
The violin, due to its intimate contact with the player, represents one of the most 
fascinating instruments for researchers in musical haptics. A rich literature has grown 
to explain the physical mechanisms at the base of its range of expressive features [60].
However, the mechanical coupling of the violin with the performer is strong, so that 
its vibratory response measured in free-suspension conditions cannot fully represent 
the vibrotactile cues generated by the instrument when in use [38]. 

The vibratory response of the piano is relatively easier to assess, as the 
instrument’s interface with the musician is limited to the keyboard and pedals. Fur- 
thermore, the mass of the piano is such that the mechanical coupling with the per- 
former’s limbs cannot affect its vibrations significantly. However, pianos couple 
with the floor; hence, vibrations can reach the pianist’s body through it and the 
seat. Piano vibrations have been carefully studied by researchers in musical acous- 
tics, who measured them mainly at the strings or soundboard [51]. In contrast, key- 
board vibrations as conveyed to the player have been less researched. In the early 
1990s, Askenfelt and Jansson performed extensive measurements on several stringed 
instruments, including the double bass, violin, guitar, and piano [4]. Overall, vibra- 
tion amplitude was measured above the standard sensitivity thresholds for passive 
touch [54], suggesting a role for tactile feedback at least in conveying a feeling of 
a resonating and responding object. This conclusion, though, was mitigated for the 
piano keyboard, whose vibration amplitude was mostly found below such thresh- 
olds and hence supposedly perceptually negligible. More recently, Keane and Dodd 
reported significant differences between upright and grand piano keyboard vibra- 
tions, while hypothesizing a perceptual role of vibrotactile feedback during piano 
playing [32]. 

Other classes of instruments, such as aerophones, likely offer measurable vibro- 
tactile cues to the performer, but to our knowledge a systematic assessment of the 
perceivable effects of such vibratory feedback has not yet been conducted.

Percussion instruments, on the other hand, respond with a strong kinesthetic feed- 
back that is necessary for performers to rearm their limbs instantaneously, and for 
executing rebounds and rolls without strain. In this regard, Dahl suggested that the 
interaction of a drumstick or a hand with the percussion point happens so rapidly
that it does not seem possible for a performer to adjust a single hit simultaneously
with the tactile feedback coming from it [11]. The percussive action, in other words,
appears to be purely feed-forward as long as multiple-hit sequences are not considered
(see also Sect. 2.2 in this regard). Finally, electroacoustic and electronic instruments 
do not seem able to generate relevant vibrotactile feedback, unless a loudspeaker 
system is mounted directly aboard them. 


4.1.2 Experiments with Musicians 


Once an instrument has been identified as a source of relevant tactile cues, their 
potential impact on musical performance and produced musical sound may be tested 
with musicians. The inclusion of human participants, however, introduces several 
issues. To start with, as mentioned above, interactive contexts such as the musical
one prevent the implementation of experiments with full control over contact areas 
and forces, or the generation of vibratory stimuli. Also, acoustical emissions from 
musical instruments engage musicians in a multisensory process where the tactile and 
auditory channels are entangled at different levels, ranging from the peripheral and 
central nervous system, to cross-modal perceptual and cognitive processes. Tactile 
and auditory cues start to interfere with each other in the middle ear. Vibrations 
in fact propagate from the skin to the cochlear system through bone and tendon 
conduction, via several pathways [12]. Especially if an instrument is played close 
to the ear (e.g., a violin) or enters into contact with large areas of the body (e.g., a 
cello or double bass), such vibrations can reach the cochlea with sufficient energy to 
produce auditory cues. Cochlear by-products of tactile feedback may be masked by 
overloading the hearing system with sufficiently loud sound that does not correlate 
with tactile feedback: Masking noise provided through headphones is often necessary 
in tactile perception tests [6, 58]. The use of bone-conduction headphones may 
improve experimental control, as bone-conducted cues could be jammed on their way 
to the cochlea by vibratory noise transferred to the skull [47]. Even when considering 
only airborne auditory feedback, earmuffs or earplugs may not provide sufficient 
cutoff, and uncorrelated masking noise may be needed. The question, then, is how 
to analyze answers from musicians who had to perform while listening to loud 
noise. The literature on audio-tactile sensory integration is particularly rich and can 
help explain possible perceptual synergies or cancellations occurring during this 
integration [46, 50, 57, 58]. 

Any tactile interaction experiment that involves musicians should take the afore- 
mentioned issues into account. In a groundbreaking study from 2003, Galembo 
and Askenfelt showed that grand pianos are mainly recognized—and possibly even 
rated—based on the tactile and kinesthetic feedback offered by their keyboards, 
more than based on the produced sound [20]. Similarly, in a later study on percussive 
musical gestures, Giordano et al. showed that haptic feedback has a bigger influence 
on performance than auditory cues do [25]. Focusing on tactile cues alone, Keane
and Dodd reported significant preference of pianists for an upright instrument whose 
keybed had been modified to decrease vibration intensity at the keyboard, thus making
the vibrations comparable to those produced by a grand piano [31, 32]. In parallel, some
authors of the present chapter augmented a digital piano with synthesized vibrotactile 
feedback, showing that it significantly modified the performer’s preference [16, 18]. 
In the same period, one of the world’s top manufacturers equipped its flagship digital 
pianos with vibration transducers making the instruments’ body vibrate while
playing [27], thus testifying to concrete interest from the industry, at least in the
aesthetic value of tactile cues.

More recently, Wollman et al. showed that salient perceptual features of violin 
playing are influenced by vibrations at the violin’s neck [59], and Altinsoy et al. 
found similar results using reproduced vibratory cues [3]. Saitis et al. discussed the 
influence of vibrations on quality perception and evaluation as manifested in the way
that musicians conceptualize violin quality [48]. Further details on the influence of
haptic cues on the perceived quality of instruments are given in Chap. 5. 


4.1.3 Premises to the Present Experiments 


Compared to other interfaces of stringed instruments, the piano keyboard is easier 
to control experimentally, as the performer is only supposed to hit and then release 
one or more keys with one or more fingers. Other body contacts can be prevented by 
excluding the use of the pedals. Also, non-airborne auditory feedback—a by-product 
of the tactile response—can be masked by employing the techniques mentioned 
above. Furthermore, the sound and string vibrations produced by a key press are in 
good correspondence with the velocity with which the hammer hits a string [33]. 
If a keyboard is equipped with sensors complying with the MIDI protocol, then
such a map is encoded for each key and made available as digital messages. Together,
these properties allow the experimenter to (i) record the vibratory response of the
keyboard to measurable key actions; (ii) create a database of reproducible
action–response relationships; (iii) make use of those data in experiments where pianists
perform simple tasks on the keyboard, such as hitting one or a few keys.

Our interest in the piano keyboard is not only motivated by its relatively easy 
experimental control: As mentioned above, its tactile feedback measured in open 
loop was found hardly above the standard vibrotactile sensitivity thresholds [4]. Did 
this evidence set an end point to the perception of piano keyboard vibrations? This 
chapter discusses and compares the results of two previously reported experiments 
on vibrotactile perception in active tasks: The first one conducted in a controlled 
setting and the other in an ecological, musical setting. The goal was twofold: (i) to 
assess how finger pressing (similar to a key-press task) affected vibrotactile detection 
thresholds and (ii) to investigate whether pianists perceive keyboard vibrations while 
playing. 

Somewhat surprisingly, in Experiment 1 we found sensitivity thresholds much
lower than those previously reported for passive tasks. Experiment 2 demonstrated
that pianists do perceive keyboard vibration, with detection rates highest at the lower
octaves and gradually decreasing toward higher pitches. Importantly, vibrations at
the piano keyboard were also measured with an accelerometer for the conditions
used in the experiment: While their intensity was generally lower than the standard
thresholds for passive touch, a comparison with the thresholds obtained
in Experiment 1 provided a solid explanation of how pianists detected vibrations
across the keyboard.

These findings suggest that studies on active touch psychophysics are required to 
better understand the relevance of haptic cues in musical performance and, conse- 
quently, to inform the development of future haptic musical interfaces. 
4.2 Experiment 1: Vibrotactile Sensitivity Thresholds
Under Active Touch Conditions 


In this experiment, vibrotactile perceptual thresholds at the finger were measured for 
several levels of pressing force actively exerted against a flat rigid surface [43]. Vibra- 
tion of either sinusoidal or broadband nature and of varying intensity was provided 
in return. The act of pressing a finger is indeed a gesture found while performing on 
many musical instruments (e.g., keyboard, reed, and string instruments) and there- 
fore represents a case study of wide interest for musical haptics. Based on the results 
reported by several previous studies [9, 10, 28, 34, 36], we expected perceptual 
thresholds to be influenced by the strength of the pressing force. 


4.2.1 Setup 


A self-designed tabletop device called the Touch-Box was utilized to measure the 
applied normal force and area of contact of a finger pressing its top surface and to 
provide vibrotactile stimuli in return. Technical details on the device are given in 
Sect. 13.3.1. The Touch-Box was placed on a thick layer of stiff rubber, and sound 
emissions were masked by noise played back through headphones. To minimize 
variability of hand posture, an arm rest was used. 

The experiment made use of two vibrotactile stimuli, implementing two different 
conditions: Band-passed white noise with 48 dB/octave cutoffs at 50 and 500 Hz and 
a sine wave at 250 Hz. Both stimuli are centered around the range of maximal vibrotactile
sensitivity (200–300 Hz [55]). During the experiment, stimulus amplitude was varied
in fixed steps according to a staircase procedure (see Sect. 4.2.2). Stimulus level was 
calculated as the RMS value of the acceleration signal, accounting for the power of 
vibration acceleration averaged across the stimulation time. 
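
The level computation described above can be sketched as follows. This is only an illustration: the 10⁻⁶ m/s² reference follows the chapter's convention for acceleration values in dB, while the sampling rate, the stimulus amplitude and duration, and the function names are assumptions made for the example and do not describe the Touch-Box software.

```python
import math

REF_ACC = 1e-6  # m/s^2, dB reference for acceleration used in this chapter


def rms_level_db(samples, ref=REF_ACC):
    """RMS level of an acceleration signal (m/s^2) in dB re `ref`."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(mean_square / ref ** 2)


def sine(amplitude, freq_hz, duration_s, rate_hz=48000):
    """Sampled sine wave; the sampling rate is an assumed value."""
    n = round(duration_s * rate_hz)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate_hz)
            for i in range(n)]


# Hypothetical 250 Hz stimulus with 0.01 m/s^2 peak acceleration.
level = rms_level_db(sine(0.01, 250.0, 1.5))
print(f"{level:.1f} dB RMS acceleration")
```

Because the level is the mean power averaged over the whole stimulus, the same function applies unchanged to the band-passed noise condition.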

Pressing force was a within-subject condition with three target levels, covering 
a range from light touch to hard press, while still being comfortable for partici- 
pants [13], as well as compatible with forces found in instrumental practice [4]. In 
what follows, the three force levels are referred to as Low, Mid, and High, which 
correspond, respectively, to 1.9, 8, and 15 N, with a tolerance of ±1.5 N.


4.2.2 Procedure 


Twenty-seven subjects participated in the sinusoidal condition, and seventeen in the 
noise condition. They were 19–39 years old (mean = 26, SD = 4.5), and half of them
were music students. The experiment lasted between 35 and 60 min, depending on the 
participants’ performance, and a 1-minute break was allowed every 5 min to prevent 
fatigue. 
Fig. 4.1 Thresholds measured at three pressing force levels, for sinusoidal and noise vibrations.
Error bars represent the standard error of the mean. Figure reprinted from [43] 


Perceptual thresholds were measured using a one-up-two-down staircase algorithm
with fixed step size (2 dB¹) and eight reversals, and a two-alternative forced
choice (2AFC) procedure. The method targets the stimulus level corresponding to a 
correct detection rate of 70.7% [35], estimated as the mean of the last six reversals 
of the up-down algorithm. 
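
The staircase rule can be sketched as below. The sketch assumes a generic `respond(level)` callback returning whether the trial was answered correctly; the function name, the starting level, the trial cap, and the idealized observer in the usage lines are hypothetical and are not taken from the experiment software. The 70.7% target follows from the rule itself: the level goes down only after two consecutive correct answers, so it settles where p² = 0.5.

```python
def staircase_1up2down(respond, start_db, step_db=2.0,
                       n_reversals=8, n_average=6, max_trials=1000):
    """One-up-two-down staircase with fixed step size.

    The level decreases after two consecutive correct responses and
    increases after each incorrect one. The estimate is the mean of
    the last `n_average` reversal levels.
    """
    level = start_db
    streak = 0        # consecutive correct responses at the current level
    direction = 0     # -1 going down, +1 going up, 0 before the first move
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if respond(level):
            streak += 1
            if streak < 2:
                continue          # need a second correct answer to step down
            move = -1
        else:
            move = +1
        streak = 0
        if direction != 0 and move != direction:
            reversals.append(level)   # direction changed at this level
        direction = move
        level += move * step_db
    if len(reversals) < n_reversals:
        raise RuntimeError("staircase did not reach the requested reversals")
    tail = reversals[-n_average:]
    return sum(tail) / len(tail)


# Idealized observer: always correct at or above 75 dB, always wrong below.
estimate = staircase_1up2down(lambda level: level >= 75.0, start_db=90.0)
print(estimate)
```

A real 2AFC observer answers correctly half of the time even without perceiving the stimulus, so actual tracks are noisier than this deterministic example.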

Three staircases were implemented, each corresponding to a target force level, 
which were presented in interleaved and randomized fashion. Participants were 
instructed to use their dominant index finger throughout the experiment. A trial con- 
sisted of two subsequent finger presses, with vibration randomly assigned to only one 
of them. The participants’ task was to identify which press contained the vibration 
stimulus. Before the observation interval began, an LCD screen turned green to signal
that the requested force level had been stably reached.


4.2.3 Results 


As shown in Fig. 4.1, at each pressing force level thresholds for sinusoidal vibration 
were lower than for noise. For both vibration conditions, higher thresholds (i.e., worse 
detection performance) were obtained at the Low force condition, while at the other 
two force levels the thresholds were generally lower. The lowest mean threshold 
(68.5 dB RMS acceleration) was measured at the High force condition with sinusoidal 
vibration, and the highest at the Low force condition with noise vibration (83.1 dB)— 
thus thresholds varied over a wide range across conditions. Individual differences 
were also large: The lowest and highest individual thresholds differ typically by about 
20 dB in each condition. 


¹In the remainder of this chapter, vibration acceleration values expressed in dB use 10⁻⁶ m/s² as
a reference.
Perceptual thresholds were analyzed by means of a mixed ANOVA. A significant
main effect was found for type of vibration (F(1, …) = 14.64, p < 0.001, generalized
η² = 0.23) and force level (F(2, 82) = 137.5, p < 0.0001, η² = 0.35), while the main
effect of musical experience was not significant. Post hoc pairwise comparisons with
Bonferroni correction (sphericity assumption was not violated in the within-subject 
force level factor) indicated that the Low force condition differed from both the Mid 
and High force conditions, for both vibration types (t (82) > 8.85, p < 0.0001 for 
all comparisons). For noise vibration, the difference between Mid and High force 
conditions was significant (t (82) = −3.17, p = 0.02), but the respective contrast
for sinusoidal vibration was not (t (82) = 1.64, p > 0.05). The difference between 
sinusoidal and noise vibrations was significant for the Low (t (57.44) = 4.37, p < 
0.001) and High (t (57.44) = 4.29, p < 0.001) force conditions, but not for the Mid 
force (t (57.44) = 1.85, p > 0.05). 


4.2.4 Discussion 


Vibrotactile perceptual thresholds were found in the range 68.5–83.1 dB RMS
acceleration, values that are considerably lower than those generally reported in
the literature. Maeda and Griffin [36] compared acceleration thresholds from various
studies addressing passive touch, finding that most of them are in the range
105–115 dB for sinusoidal stimuli ranging from 100 to 250 Hz. The lowest reported
acceleration thresholds are 97–98.5 dB, for contact areas (probe size) ranging from
53 to 176.7 mm² [1, 2, 15]. It is worth noting that the widely accepted results
by Verrillo [55] report lowest displacement thresholds of approximately −20 dB
(re 10⁻⁶ m) at 250 Hz, equivalent to about 105 dB RMS acceleration.²
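
The equivalence stated above can be checked numerically with the conversion given in the accompanying footnote. The sketch below assumes that the displacement threshold is a peak value and uses the chapter's 10⁻⁶ m/s² reference for acceleration in dB; the function and variable names are illustrative only.

```python
import math

REF_ACC = 1e-6   # m/s^2, dB reference for acceleration
REF_DISP = 1e-6  # m, dB reference for displacement


def displacement_peak_to_acc_rms_db(peak_displacement_m, freq_hz):
    """Convert a sinusoid's peak displacement to RMS acceleration in dB."""
    acc_peak = peak_displacement_m * (2 * math.pi * freq_hz) ** 2
    acc_rms = acc_peak / math.sqrt(2)
    return 20.0 * math.log10(acc_rms / REF_ACC)


# -20 dB re 1e-6 m corresponds to a displacement of 0.1 micrometers.
threshold_m = REF_DISP * 10 ** (-20 / 20)
verrillo_db = displacement_peak_to_acc_rms_db(threshold_m, 250.0)
print(f"{verrillo_db:.1f} dB RMS acceleration")
```

The result rounds to the value of about 105 dB quoted in the text.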

The main result of the present experiment is that vibrotactile sensitivity depends 
on the applied pressing force. Thresholds were highest at the Low force condition and 
decreased significantly at both Mid and High force levels. In good accordance with
what was reported in a preliminary study [44], for noise vibration the lowest threshold
was obtained at the Mid force condition, while at the Low and High conditions 
thresholds were higher, resulting in a U-shaped threshold contour with respect to the 
applied force. However, as shown in Sect. 13.3.1.4, the spectral centroid of the noise 
vibration generally shifted toward 300 Hz and higher frequencies for the Mid and 
High force conditions. Therefore, we suggest that the U-shape of the threshold-force 
curve might be partially due to the response of the Pacinian channel, which shows 
a U-shaped contour over the frequency range 40-800 Hz with maximum sensitivity 
in the 200-300 Hz range [8]. Conversely, for sinusoidal vibrations at 250 Hz, mean 
dB thresholds decreased roughly logarithmically for increasing pressing forces (see 
Fig. 4.1). This simpler trend may be due to the more consistent behavior of our system
when reproducing simpler sinusoidal vibrations (see Sect. 13.3.1.4). An improved
version of the Touch-Box would be needed to test whether a similar trend can be
found when noise stimuli are reproduced more linearly for varying pressing forces.

²For a sinusoidal vibration signal $s$, it is straightforward to convert between acceleration and
displacement: $s_{\mathrm{acc}} = s_{\mathrm{displ}} \cdot (2\pi f)^2$, where $f$ is the frequency. Also, RMS values can be
obtained directly from peak values: $s_{\mathrm{RMS}} = s_{\mathrm{peak}} / \sqrt{2}$.

Further studies are needed to precisely assess how vibratory thresholds might 
be affected by passive forces of strength equivalent to the active forces used in the 
present study. However, since the Low condition in our experiment was already 
satisfied by applying light pressing force (the measured mean is about 1.49 N), it
may be compared to studies addressing passive static forces. Craig and Sherrick [10]
found that increasing static force on the contactor produces an increase in vibrotactile
magnitude. They considered vibration bursts at 20, 80, and 250 Hz lasting 1240 ms,
contact areas up to 66.3 mm², and static forces of about 0.12 and 1.2 N. Harada and
Griffin [28] used a contact area of 38.5 mm² and found that forces in the range 1–3 N
led to significant lowering of thresholds by 2–6 dB RMS at 125, 250, and 500 Hz.
The lowest thresholds reported are however around 100 dB RMS acceleration. On
the other hand, Brisben et al. [9] reported that passive static contact forces from 0.05
to 1.0 N did not have an effect on thresholds. However, with only four participants,
the statistics of those results are not robust. Nevertheless, the authors suggested that 
extending these investigations to higher forces, as found in everyday life, would 
be important. They also hypothesized that increasing the force beyond 1-2 N could 
lower thresholds by better coupling of vibrating surfaces to bones and tendons, which 
could result in more effective vibration transmission to distant Pacinian corpuscles. 
This might also help explain the generally lower thresholds that we found
for higher forces. In our study, force level was found strongly correlated to contact 
area, resulting in larger areas for higher forces, which clearly contributed to further 
lowering perceptual thresholds [43]. 

Only a few related studies are found in the literature dealing with non-sinusoidal 
stimuli. Gescheider et al. [22] studied difference limens for the detection of changes 
in vibration amplitude, with either sinusoidal stimuli at 25 or 250 Hz or narrowband 
noise with spectrum centered at 175 Hz and 24 dB/octave falloff at 150 and 200 Hz 
(contact area 2.9 cm²). They found that the nature of the stimuli had no effect on
difference limens. 

Wyse et al. [61] conducted a study with hearing-impaired participants and found 
that, for complex stimuli and whole hand contact (area of about 50–80 cm²), the
threshold at 250 Hz was 80 dB RMS acceleration, i.e., comparable with our results, 
especially in the Low force condition. In that study, it is hypothesized that the tem- 
poral dynamics of spectrally complex vibration might play a key role in detecting 
vibrotactile stimulation. In our case, however, the stimuli had no temporal dynam- 
ics. Sinusoidal stimuli resulted in lower RMS acceleration thresholds as compared 
to noise vibration. This may be explained intuitively by considering that equivalent 
RMS acceleration values for sinusoidal and noise stimuli actually result in a similar 
amount of vibration power being concentrated at 250 Hz (a frequency characterized 
by peak tactile sensitivity [55]), or spread across the 50-500 Hz band, respectively. 
This explanation is supported by the findings by Young et al. [64], who reported 
lower thresholds produced by sinusoidal stimuli than spectrally more complex sig- 
nals (square and ramp waves). 
The Pacinian channel, targeted by this study, is capable of spatial summation.
Previous studies [21, 55] showed that for contact areas between 2 and 510 mm²
at the thenar eminence of the hand, and for frequencies in the 40-800 Hz range, 
displacement thresholds decrease by approximately 3 dB with every doubling of the 
area. Intuitively, a reason for that is that the number of stimulated skin receptors 
increases with larger contact areas. In the present experiment, the interactive nature 
of the task resulted in high variability of the contact area [43]. The mean contact 
areas measured in the experiment were in the range 103–175 mm², contributing to
explaining the reported enhanced sensitivity. 
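
To give a sense of scale, the rule of roughly 3 dB per doubling of contact area can be turned into a small calculation. This is a back-of-the-envelope sketch only: the rule was reported for the thenar eminence, and applying it to the finger-pad areas measured here is an assumption, as is the function name.

```python
import math


def spatial_summation_shift_db(area_from_mm2, area_to_mm2, db_per_doubling=3.0):
    """Expected threshold change (dB) when the contact area changes.

    Negative values mean lower thresholds, i.e., higher sensitivity.
    """
    return -db_per_doubling * math.log2(area_to_mm2 / area_from_mm2)


# Smallest versus largest mean contact area measured in Experiment 1.
shift = spatial_summation_shift_db(103.0, 175.0)
print(f"{shift:.1f} dB")
```

Under these assumptions, the difference in contact area across force levels alone would account for a threshold change of only a few dB.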

The Pacinian channel is also sensitive to temporal summation, which lowers sen- 
sitivity thresholds and enhances sensation magnitude [21]. Verrillo [53] found that 
thresholds decrease for stimuli at 250 Hz for increasing duration up to about 1 s,
when delivered through a 2.9 cm² contactor to the thenar eminence of the hand.
Gescheider and Joelson [23] examined temporal summation with stimulus intensi- 
ties ranging from the threshold to 40 dB above it: For 80 and 200 Hz stimuli, peak 
displacement thresholds were lowered by up to about 8 dB for duration increasing 
from 30 to 1000 ms. The present study made use of stimuli lasting 1.5 s, which likely 
contributed to enhancing vibrotactile sensitivity. 

Large inter-individual differences in sensitivity were found in our experiment, 
which we could not fully explain by contact area or age. However, this observation 
is in accordance with other studies [1, 29, 36, 41]. Sources for large variations in 
sensitivity may be many. While exposure to vibration is a known occupational health 
issue and can cause acute impairment of tactile sensitivity [28], experience in condi- 
tions similar to the present experiment seemed a possible advantage. Therefore, we 
further analyzed the performance of musician participants, who are often exposed to 
vibrations when performing on their instruments: Indeed, musicians’ mean thresh- 
old in the Low force condition was about 3 dB lower than non-musicians’, but there 
was no significant difference at the other force levels. Overall, enhanced sensitiv- 
ity in musicians—previously observed by other authors [14, 45, 65]—could not be 
confirmed. 

By considering actively applied forces and unconstrained contact of the finger pad, 
the present study adopted a somewhat more ecological approach [24] as compared 
to the studies mentioned above. An analogous approach was adopted by Brisben 
et al. [9], who studied vibrotactile thresholds in an active task that required partic- 
ipants to grab a vibrating cylinder. While the exerted forces were not measured, in 
accordance with our results much lower thresholds were reported than in most of the
previous literature: At 150 and 200 Hz, the average displacement threshold was 0.03 µm
peak (down to 0.01 µm in some subjects), which is equivalent to RMS acceleration
values of 85.5 dB at 150 Hz, and 90.5 dB at 200 Hz. The authors suggested that such 
low figures could be due to the multiple stimulation areas on the hand involved in 
grabbing the vibrating cylinder, the longitudinal direction of vibration, and the force 
exerted by the participants. A few studies report that active movement results in 
lower sensitivity thresholds [63] or a better percept, possibly due to the involvement of
planning and additional cognitive load as compared to the passive case [52]. 
Despite its partially ecological setting, this experiment kept control over the generation
of sinusoidal and noise vibrations, with focus on the region of maximal human
vibrotactile sensitivity (200-300 Hz). Vibratory cues at the piano keyboard, however 
similar in form to the respective tones, are more complex than either of the condi- 
tions in Experiment 1 and are likely to be perceived differently depending on the 
type of touch and the number of depressed keys. The following experiment tested
first whether vibrations are detected in a piano-playing task, and second whether the active
touch sensitivity threshold curves of Experiment 1 could predict the measured results.


4.3 Experiment 2: Vibration Detection at the Piano 
Keyboard During Performance 


A second experiment investigated vibrotactile sensitivity in a musical setting [19]. 
Specifically, the goal was to measure the ability of pianists to detect vibration at the 
keyboard while playing. Vibration detection was measured for single and multiple 
tones of varying pitch, duration, and dynamics. 


4.3.1 Setup 


The experiment was performed at two separate laboratories using similar setups, cen- 
tered around two Yamaha Disklavier pianos: A grand model DC3 M4 and an upright 
model DU1A with control unit DKC-850. The Disklaviers are MIDI-compliant
acoustic pianos equipped with sensors for recording performances and electrome- 
chanical motors for playback. They can be switched from normal operation to a 
“silent mode.” In the latter modality, the hammers do not hit the strings and there- 
fore the instruments neither resound nor vibrate, while their MIDI features and other 
mechanical operations are left unaltered. The two setups are shown in Fig. 4.2. 

During the experiment, the normal and silent modes were switched back and forth 
across trials, letting participants receive respectively either natural or no vibrations 
from the keys. In both configurations, participants were exposed to the same auditory 
feedback produced by a physical modeling piano synthesizer (Modartt Pianoteq), set 
to simulate either a grand or an upright piano, and driven in real time by MIDI data sent 
by the Disklaviers. The synthesized sound was reproduced through Sennheiser HDA- 
200 isolating reference headphones (grand piano) or Shure SE425 earphones (upright 
piano). In the latter case, 3M Peltor X5A earmuffs were worn on top of the earphones
for additional isolation. Preliminary testing confirmed that through these setups the 
Disklaviers’ operating modes (normal or silent) were indistinguishable while listen- 
ing to the piano synthesizer from the performer’s seat position, meaning that any 
acoustic sound coming from the pianos in normal mode was fully masked. 
Fig. 4.2 The two Disklavier setups used in the experiment. Left: Yamaha DC3 M4 grand piano.
Right: Yamaha DU1A upright piano. Figure adapted from [19]


The loudness and dynamic response of the piano synthesizer were preliminarily
calibrated to match those of the corresponding Disklavier model in use (details are 
given in Sect. 13.3.2). 

Participants could sense the instrument’s vibration only through their fingers on 
the keyboard. Other sources of vibration were excluded: The pedals were made 
inaccessible, while the stool, the player’s feet, and the piano were isolated from the 
floor by various means [17]. Vibration measurements confirmed that, as a result of 
the mechanical insulation, playing the piano did not cause vibrations at the player’s 
seat exceeding the noise floor in the room. 

The experiment was conducted under human control, with the help of software 
developed in the Pure Data environment, which was used to: (i) read computer- 
generated playlists describing the experimental trials; (ii) set the Disklavier’s playing 
mode accordingly; (iii) check if the requested tasks were executed correctly; (iv) 
record the participants’ answers. 


4.3.2 Procedure 


Sensitivity was measured at six A tones of different pitch ranging from A0 to A5,
chosen after a pilot study [17] that reported a significant drop in detection above A5.
Tone duration was either “long” (8 metronome beats at 120 BPM) or “short” (2
beats), and dynamics either “loud” (mf to ff, corresponding to MIDI key velocities
in the range 72–108) or “soft” (p to mp, key velocities 36–54). In addition to single
tones, participants were requested to play three-tone clusters around D4 and D5.
The experiment consisted of two parts: In part A, participants played long and loud 
single tones; in part B, tone dynamics and duration were modified so as to make the 
detection task potentially harder in the low range, where vibrations should be most 
easily perceived [17]. Additionally, by extending the contact area, the note clusters
were expected to facilitate detection in the high range, where sensitivity should be
low [17]. The conditions are summarized in Table 4.1.

Table 4.1 Factors and conditions in the piano experiment

          # of keys   Pitch                   Playing style
Part A    1           A0/A1/A3/D4/A4/D5/A5    Long and loud
Part B    1           A0/A1                   Short and loud / Long and soft
          3           CDE4/CDE5               Long and loud

The experiment followed a 2AFC (yes/no) procedure, which required participants 
to report whether they had detected vibrations during a trial or not. Each condition was 
repeated eight times in normal mode and eight times in silent mode, in randomized 
order. However, part A was performed before part B. 
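The trial structure described above can be sketched in a few lines of Python. This is only an illustration of how a computer-generated playlist could be built from Table 4.1: the names PART_A, PART_B, and make_playlist are hypothetical, the fixed random seed is an arbitrary choice, and the sketch assumes that randomization took place within each part. It is not the Pure Data software used in the experiment.

```python
import random

# Conditions as listed in Table 4.1: (number of keys, pitch, playing style).
PART_A = [(1, p, "long and loud")
          for p in ("A0", "A1", "A3", "D4", "A4", "D5", "A5")]
PART_B = (
    [(1, p, s) for p in ("A0", "A1")
     for s in ("short and loud", "long and soft")]
    + [(3, c, "long and loud") for c in ("CDE4", "CDE5")]
)
REPETITIONS = 8  # repetitions per condition and per playing mode


def make_playlist(conditions, rng):
    """Return a shuffled list of trials for one part of the experiment."""
    trials = [
        {"keys": keys, "pitch": pitch, "style": style, "mode": mode}
        for keys, pitch, style in conditions
        for mode in ("normal", "silent")
        for _ in range(REPETITIONS)
    ]
    rng.shuffle(trials)
    return trials


rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
# Part A always precedes part B; trials are shuffled within each part (assumption).
playlist = make_playlist(PART_A, rng) + make_playlist(PART_B, rng)
print(len(playlist), "trials in total")
```

With seven conditions in part A and six in part B, each presented sixteen times, every participant would face 112 trials in part A and 96 in part B under these assumptions.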

Participants were instructed to use their index fingers for single keys or fingers 
2-3-4 for chords and to play pitches lower than the middle C with their left hand and 
the rest with the right hand. 

Fourteen piano students participated in the upright piano condition, and fourteen
in the grand piano condition. Their average age was 27 years, and they had on average
15 years of training, mainly on the acoustic piano.


4.3.3 Results 


Sensitivity index d’, as defined in signal detection theory [26], was computed for 
each subject and condition as follows: 


$$d' = Z(\mathrm{hits}) - Z(\mathrm{false\ alarms}),$$


where Z(p) is the inverse of the Gaussian cumulative distribution function, hits is 
the proportion of “yes” responses with vibrations present, and false alarms is the 
proportion of “yes” responses with vibrations absent. Thus, a proportion of correct 
responses p(c) = 0.69 corresponds to d’ = 1 and chance performance p(c) = 0.50 
to d’ = 0. Perfect proportions 1 and 0 would result in infinite d’ and were therefore 
replaced by (1 - 1/16) and (1/16), respectively [26].
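As a minimal sketch of this computation, the following Python code evaluates d’ from hit and false-alarm proportions, using the inverse Gaussian cumulative distribution function from the standard library. The function name d_prime is hypothetical, and the sketch assumes that only the perfect proportions 0 and 1 are replaced by 1/16 and 1 - 1/16, as stated above.

```python
from statistics import NormalDist

CORRECTION = 1 / 16  # replacement value for perfect proportions


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = Z(hits) - Z(false alarms)."""

    def z(p):
        # Perfect proportions would give an infinite Z score.
        if p >= 1.0:
            p = 1.0 - CORRECTION
        elif p <= 0.0:
            p = CORRECTION
        return NormalDist().inv_cdf(p)

    return z(hit_rate) - z(false_alarm_rate)


# Eight trials per mode: 7 "yes" answers with vibrations present
# and 2 "yes" answers with vibrations absent.
print(round(d_prime(7 / 8, 2 / 8), 2))
```

An unbiased observer with p(c) = 0.69 has a hit rate of 0.69 and a false-alarm rate of 0.31, for which the function returns a value close to 1, in agreement with the correspondence given in the text.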

Results of part A are presented at the top of Fig. 4.3: Sensitivity was highest in 
the lower range and decreased toward higher pitches. At A4 (440 Hz), vibrations 
were still detected with mean d’ = 0.84, while at D5 (587 Hz) and A5, performance 
dropped to chance level. A mixed ANOVA indicated a significant main effect of 
pitch (F (6, 156) = 26.98, p < 0.001). The results for the upright and the grand 
piano did not differ significantly, nor was there a significant interaction of pitch 
and piano type. The Mauchly test showed that sphericity had not been violated. 
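For reference, the nominal fundamental frequencies quoted for the tested pitches follow from the equal-tempered scale. The short sketch below assumes A4 = 440 Hz and ideal equal temperament (real pianos are tuned with some stretch, so measured fundamentals may differ slightly); the function name fundamental_hz is hypothetical.

```python
# Semitone offsets of the natural notes within an octave starting at C.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def fundamental_hz(note):
    """Nominal equal-tempered fundamental of a natural note such as 'A4'."""
    name, octave = note[0], int(note[1:])
    midi = 12 * (octave + 1) + NOTE_OFFSETS[name]  # MIDI note number, A4 = 69
    return 440.0 * 2 ** ((midi - 69) / 12)


for note in ("A0", "A1", "A3", "D4", "A4", "D5", "A5"):
    print(f"{note}: {fundamental_hz(note):.1f} Hz")
```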


[Figure: sensitivity d’ for the grand and upright pianos, plotted against pitch (top panel) and against pitch and condition (bottom panel)]

Fig. 4.3 Sensitivity d’ in part A (top) or parts A and B (bottom). Error bars represent the standard
error of the mean [40]. Chance performance (d’ = 0) is represented by the dashed line. Figures
reprinted from [19]

The results were collapsed over upright and grand pianos, and a trend analysis was
conducted. A linear trend was significant (t (156) = -12.3, p < 0.0001), indicating
that as pitch increases, sensitivity to vibrations decreases. Results from parts A and
B are presented together at the bottom of Fig. 4.3, showing small differences in
mean sensitivity between normal, soft, and short conditions. However, none of the
contrasts between long and short duration or loud and soft dynamics at A0 or A1
was significant. The difference was more notable between clusters and single notes:
For the cluster CDE4, sensitivity was significantly higher than for the isolated note
D4 (t (294) = 5.96, p < 0.0001), whereas the much smaller difference between D5
and the cluster CDE5 was not significant. Even considering the possible effect of
learning between parts A and B (average sensitivity at pitches A0 and A1 was 0.23
higher in part B), the result suggests that at D4, playing a cluster of notes facilitates
vibration detection.


4.3.4 Vibration Characterization 


In order to gain further insight into the results, vibration signals at the keyboard were 
measured on both the grand and upright Disklaviers. 

An in-depth description of the measurements and related issues is given in 
Sect. 13.3.2.2. For convenience, only essential details are reported here. Vibration 
signals were acquired for different MIDI velocities at each of the 88 keys of the 
Disklavier pianos via a measurement accelerometer and recorded as audio signals. A 
digital audio sequencer software was used to record vibration signals, while reproduc- 
ing MIDI tracks that played back each single key of the Disklaviers. Additional MIDI 
tracks were used to play CDE4 and CDE5 clusters, while vibration was recorded with
the accelerometer attached to the respective C, D, and E keys in sequence. The MIDI 
velocities were chosen to cover the entire dynamics reproducible by the Disklaviers’ 
motors. 

Acceleration signals had a large onset in the attack, corresponding to the initial fly 
of the keys followed by their impact against the keybed. Figure 4.4 shows a typical 
attack, recorded from the grand Disklavier playing the A2 note at MIDI velocity 
12. These onsets, appearing in the first 200-250 ms, are not related to the vibratory 
response of the keys and were therefore manually removed from the samples. 

Fig. 4.4 Attack of the acceleration signal recorded for note A2, MIDI velocity 12, grand
Disklavier. Figure reprinted from [19]
[Figure: amplitude (digital values) versus time (s), from 0 to 0.35 s]

Fig. 4.5 Vibration spectrum of A0 played with ff dynamics (MIDI velocity 111) on the upright
Disklavier, represented as magnitude acceleration in dB. The vertical dotted line shows the
nominal fundamental frequency f0 = 27.5 Hz. The dashed curve represents vibrotactile
acceleration thresholds at the fingertip adapted from Verrillo [55]. Figure reprinted from [19]
[Figure: acceleration magnitude (dB, 0 to 160) versus frequency (Hz, logarithmic axis)]

Acceleration values in m/s² were computed from the acquired signals by making
use of the nominal sensitivity parameters of the audio interface and the accelerometer.
Similarly to what was done by Askenfelt and Jansson [4], the spectra of
the resulting acceleration signals were compared to Verrillo’s reference vibrotactile
sensitivity curve [55]. Note that this curve reports sensitivity as the smallest,
frequency-dependent displacement $A(f)$ (in meters) of a sinusoidal stimulus
$s(t) = A(f)\sin(2\pi f t)$ that is detected at the fingertip. Therefore, a corresponding
acceleration curve was computed from the original displacement curve in order
to compare with our acceleration signals. Thanks to the sinusoidal nature of the
stimuli employed by Verrillo, the corresponding acceleration signal could be found
analytically as $\ddot{s}(t) = -A(f)\,(2\pi f)^2 \sin(2\pi f t)$. Consequently, the acceleration
threshold curve $A(f)\,(2\pi f)^2$ was used for comparison to our signals. Confirming
the results by Askenfelt and Jansson [4], no spectral peaks were found to exceed the
acceleration threshold curve, even for notes played with high dynamics. To exemplify
this, Fig. 4.5 shows the spectrum of the highest dynamics of the note that participants
detected with the highest sensitivity (part A), i.e., A0 played at MIDI velocity 111,
along with the threshold curve.
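The conversion from a displacement threshold to an acceleration threshold can be illustrated as follows. This is a minimal sketch: the function names are hypothetical, the displacement values are placeholders chosen only to exercise the code and are not taken from Verrillo’s curve [55], and the decibel reference of 10⁻⁶ m/s² is an assumption, since the reference level is not stated in this section.

```python
import math


def acceleration_threshold(displacement_m, frequency_hz):
    """Acceleration amplitude (m/s^2) of a sinusoid with the given
    displacement amplitude (m) and frequency (Hz): A(f) * (2*pi*f)^2."""
    return displacement_m * (2 * math.pi * frequency_hz) ** 2


def to_db(acceleration, reference=1e-6):
    """Level in dB relative to `reference` m/s^2 (assumed reference)."""
    return 20 * math.log10(acceleration / reference)


# Placeholder displacement thresholds in meters (NOT data from [55]).
placeholder_curve = {50.0: 1e-6, 250.0: 1e-7, 500.0: 1e-6}

for frequency, displacement in placeholder_curve.items():
    acc = acceleration_threshold(displacement, frequency)
    print(f"{frequency:6.1f} Hz: {acc:.4f} m/s^2, {to_db(acc):.1f} dB")
```

Because of the factor $(2\pi f)^2$, a displacement threshold that is flat over frequency turns into an acceleration threshold that rises by 12 dB per octave.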

Fig. 4.6 RMS acceleration values of keys played as in part A (top) or parts A and B (bottom). The
horizontal lines represent (min/max) vibrotactile thresholds as measured in Experiment 1 for noise
and sinusoidal stimuli over a range of active pressing forces. Figure adapted from [19]

Since Verrillo’s thresholds cannot explain the results of Experiment 2, RMS acceleration
values were computed in place of spectral peak amplitudes, in analogy with
Experiment 1. Vibration signals were first processed with a specifically designed
low-pass filter to shape the stimuli according to the human vibrotactile band [19]. RMS
values in dB were then extracted from the filtered signals over time windows equal
to the lengths of the stimuli, that is, 1 s for short and 4 s for long trials. Figure 4.6
shows the resulting RMS values for parts A and B, together with the
RMS thresholds of vibration reported in Experiment 1. A comparison of the RMS
acceleration values and perceptual thresholds for noise shown in this figure against
the sensitivity curves of Fig. 4.3 suggests that RMS values of broadband stimuli have
more potential to explain the results of Experiment 2.
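The RMS extraction step can be sketched as below. The sketch is illustrative only: the function name rms_db is hypothetical, the decibel reference of 10⁻⁶ m/s² is an assumption, and the low-pass filtering stage is omitted because the filter design is not given in this section. A synthetic sinusoid stands in for a measured acceleration signal.

```python
import math


def rms_db(samples, sample_rate, window_s, reference=1e-6):
    """RMS level in dB re `reference` over the first `window_s` seconds."""
    n = min(len(samples), int(round(window_s * sample_rate)))
    if n == 0:
        raise ValueError("empty analysis window")
    mean_square = sum(x * x for x in samples[:n]) / n
    return 20 * math.log10(math.sqrt(mean_square) / reference)


# Synthetic stand-in for a recording: 4 s of a 100 Hz sinusoid
# with an amplitude of 1 m/s^2, sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 100 * k / sample_rate)
          for k in range(4 * sample_rate)]

short_level = rms_db(signal, sample_rate, window_s=1.0)  # "short" trial
long_level = rms_db(signal, sample_rate, window_s=4.0)   # "long" trial
print(f"{short_level:.2f} dB, {long_level:.2f} dB")
```

For a stationary test signal the two windows give the same level; for a decaying piano tone the longer window would be expected to yield a lower RMS value, which is one reason why the window length is tied to the stimulus duration.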


4.3.5 Discussion


The results presented in the previous section show that sensitivity to key vibrations 
is highest in the lowest range and decreases toward higher pitches. Vibrations are 
clearly detected in many cases where the vibration acceleration signals hardly reached 
typical thresholds found in the literature for sinusoidal stimuli. 

The literature on the detection of complex stimuli provides support to our results, 
although it does not explain them completely. As already discussed in Sect. 4.2.4, 
Wyse et al. [61] report RMS acceleration threshold values at 250 Hz corresponding 
to 80 dB, a value compatible with our results. However, the characteristics of those 
stimuli may have occasionally produced significant energy at lower frequencies, 
causing the thresholds to lower once they were presented to the whole hand. 

The pianist receives the initial transient when the hammer hits the string; then,
the vibration energy promptly decreases and its partials fade, each with its own decay
curve. The initial peak may produce an enhancement effect similar to those measured
by Verrillo and Gescheider, although their measurements were limited to sinusoids [56],
and hence contribute to sensitivity.

As discussed earlier, the P-channel is sensitive to the signal energy, but it is not
able to recognize complex waveforms. Loudness summation instead occurs when
vibration stimulates both the Pacinian and non-Pacinian (NP) channels, lowering the
thresholds accordingly [7, 37, 56]. In our experiment, summation effects were likely
to occur when the A0 key and, possibly, the A1 key were pressed. From A3 on,
only the P-channel became responsible for vibration perception. Figure 4.3 seems to
confirm these conclusions, since it shows a pronounced drop in sensitivity between
A1 and A3 in both parts of Experiment 2. As Fig. 4.6 demonstrates, this drop is only
partially explained by a proportional attenuation of the vibration energy in the grand
piano, while it is not explained at all in the upright piano. Hence, it is reasonable to
conclude that the NP-channel played a perceptual role up to A3. Beyond that pitch,
loudness summation effects ceased.

In analogy with Experiment 1, the results of this experiment also suggest the 
occurrence of spatial summation effects [10] when a cluster of notes, whose funda- 
mentals overlap with the tactile band, is played instead of single notes. As Fig. 4.3 
(bottom panel) shows, playing the cluster in the fourth octave boosted the detection 
in that octave, whereas the same effect did not occur in the fifth octave. Unlike Exper- 
iment 1, this summation originates from multifinger interaction rather than varying 
contact areas in single-finger interaction. This evidence opens an interesting ques- 
tion about the interaction of complex vibrations reaching the fingers simultaneously. 
Measurements of cutaneous vibration propagation patterns in the hand resulting from 
finger tapping show, however, an increase in both intensity and propagation distance 
with the number of fingers involved [49], which may partially explain the increased 
sensitivity we observed.

Unlike Experiment 1, where unimodal tactile stimuli were used, here we
employed bimodal audio-tactile stimuli. Therefore, the possibility of cross-modal
amplification effects needs to be briefly discussed, even though Experiment 2 did
not investigate this aspect. As discussed earlier, previous studies on cross-modal
integration effects [46, 58] support the concrete possibility that an audible piano 
tone, whose vibratory components are a subset of the auditory components, helps 
detect a tactile signal near threshold. Although in our case the sound came from a 
synthesizer, both the auditory and tactile signals shared the same fundamental fre- 
quency of the piano tone, and furthermore the first partials were close to each other, 
respecting the hypothesis of proximity in frequency investigated by Wilson [58]. We 
did not test a condition in which subjects played the piano in normal mode in the
absence of auditory feedback, or using sound uncorrelated with vibration (e.g., white
noise). Although such a condition might provide significant data about the actual
contribution of auditory cues to vibration detection on the piano, it would require a
different experimental setup. Other cross-modal effects that may instead have contributed to
impairing detection [62] should be considered minor with respect to the spectral
compatibility and temporal synchronization of the audio-tactile stimulus occurring
when a piano key was pressed.

Yet another relevant difference with Experiment 1 is that in this case the pressing
forces exerted by pianists were unknown and most likely not constant throughout
a single trial. The maximum and minimum sensitivity threshold lines in Fig. 4.6,
which report the results of Sect. 4.2.3, correspond to constant pressing forces of
1.9 and 8 N for noise vibration, and 1.9 and 15 N for 250 Hz sinusoidal vibration.
These force values occur when piano keys are hit at dynamics between pp and f, with
negligible difference between struck and pressed touch styles [20, 33]. Conversely, ff
dynamics require stronger forces of up to 50 N [4]. In Experiment 2, it seems reasonable
to assume that pianists initially pressed the key according to the dynamics required
by the trial and then, once the key had reached the keybed, adjusted the finger
force to a comfortable value while attending to the detection task. If our participants
adapted finger forces toward the range mentioned above, then their performance in
this experiment would fall in between the results for sinusoidal and noise stimuli in
Experiment 1. Experiment 1 additionally found that, when using low finger force,
musicians on average exhibit slightly better tactile acuity than non-musicians. Even
if this difference was not significant, our participants could have reduced the finger
force only after starting a trial that required loud dynamics, while leaving the force
substantially unchanged during the entire task in the other cases. This behavior seems
indeed quite natural.

The hypothesis that vibrotactile sensitivity to RMS acceleration falls in between
the thresholds for 250 Hz sine wave and filtered noise is consistent with the temporal
and spectral characteristics of the stimuli: Right after its initial transient, a piano
tone closely resembles a decaying noisy sinusoid. For instance, it can be simulated by
employing several hundred damped oscillators whose outputs are subsequently
filtered using a high-order transfer characteristic [5]. A remaining question is whether
the RMS acceleration values of filtered noise plotted in Fig. 4.6 explain our thresholds
sufficiently, or if there is a need to discuss them further. Other elements in
favor of further discussion are the potential cross-modal amplification mentioned above
and evidence of superior tactile acuity in musicians [65].


4.4 Conclusions 


We have given an introduction to the role of active touch in musical haptic research.
A closed loop between musicians and their instrument during performance poses a
major challenge to experimental setups: While playing, musicians themselves generate
the vibrotactile feedback and are at the same time influenced by the produced
sound. To discuss the possible links between music performance tasks and basic
active touch psychophysics, we presented two experiments, one in a controlled and
one in an ecological setting, showing evidence that pianists perceive keyboard vibrations
with sensitivity values resembling those obtained under controlled active touch
conditions. Overall, the results presented here suggest that research on active touch
in musical performance may prove valuable for understanding the role, mechanisms, and
prospective applications of active touch perception also outside the musical context.
An example application that seems within immediate reach of current tactile interfaces
is to create illusory effects of loudness change by varying the intensity of vibratory
feedback [39, 42].

Although interesting and necessary, our results represent only a premise for further 
research activities aimed at precisely understanding the role of tactile feedback during 
piano playing. Exploratory experiments have already been performed in an attempt to 
understand whether changes in the “timbre” of tactile feedback may determine equiv- 
alent auditory sensations. Some results in this regard are presented in Sect. 5.3.2.2. 
If confirmed, after excluding the influence of non-airborne sonic cues on auditory 
perception, such results would imply the ability of the tactile and auditory systems to 
interact so as to form a wider, multimodal notion of musical timbre, for which some 
partial evidence has been found in musicians [59] and non-musicians [47]. Several 
questions related to the role of tactile feedback in musical performance remain open. 
For instance, feedback from percussion instruments is likely to define strong patterns
of skin vibration extending far beyond the interaction point. The propagation
of vibration across the skin has recently been an object of research with potentially
interesting haptic applications outside the musical context [49]. It cannot be excluded that
percussionists control their playing by testing specific wide-area tactile patterns that they
have learned and retained in somatosensory memory after years of practice with
their instrument: Otherwise, drummers and percussionists would not experience a sense of
unnatural interaction with the instrument when they play rubber
pads and other digital interfaces. Furthermore, while it is not precisely known how
wind instrument players make use of the vibrations transmitted by the mouthpiece,
digital wind controllers like the Yamaha WX series never achieved wide popularity,
possibly also due to their unnatural haptic feedback.


Acknowledgements The authors wish to thank Francesco and Valerio Zanini for recording piano 
vibrations and contributing to perform the piano experiment. This research was pursued as part of 
project AHMI (Audio-Haptic modalities in Musical Interfaces, 2014-2016), funded by the Swiss 
National Science Foundation. 


References 


1. Aaserud, O., Juntunen, J., Matikainen, E.: Vibration sensitivity thresholds: methodological
considerations. Acta Neurologica Scandinavica 82, 277-283 (1990)

2. Aatola, S., Farkkila, M., Pyykkö, I., Korhonen, O.: Measuring method of vibration perception
threshold of fingers and its application to vibration exposed workers. Int. Arch. Occup. Environ.
Health 62, 239-242 (1990)

3. Altinsoy, M.E., Merchel, S., Tilsch, S.: Perceptual evaluation of violin vibrations and
audio-tactile interaction. Proc. Meet. Acoust. 19(1), 15-26 (2013)

4. Askenfelt, A., Jansson, E.V.: On vibration sensation and finger touch in stringed instrument
playing. Music Percept. 9(3), 311-349 (1992)

5. Bank, B., Zambon, S., Fontana, F.: A modal-based real-time piano synthesizer. IEEE Trans.
Audio Speech Lang. Process. 18(4), 809-821 (2010) (Special Issue on Virtual Analog Audio
Effects and Musical Instruments)

6. Bensmaia, S., Hollins, M., Yau, J.: Vibrotactile intensity and frequency information in the
pacinian system: a psychophysical model. Percept. Psychophys. 67(5), 828-841 (2005)

7. Bensmaia, S.J., Hollins, M.: Complex tactile waveform discrimination. J. Acoust. Soc. Am.
108(3), 1236-1245 (2000)

8. Bolanowski, S.J., Gescheider, G.A., Verrillo, R.T., Checkosky, C.M.: Four channels mediate
the mechanical aspects of touch. J. Acoust. Soc. Am. 84(5), 1680-1694 (1988)

9. Brisben, A.J., Hsiao, S.S., Johnson, K.O.: Detection of vibration transmitted through an object
grasped in the hand. J. Neurophysiol. 81(4), 1548-1558 (1999)

10. Craig, J.C., Sherrick, C.E.: The role of skin coupling in the determination of vibrotactile spatial
summation. Percept. Psychophys. 6(2), 97-101 (1969)

11. Dahl, S.: Striking movements: a survey of motion analysis of percussionists. Acoust. Sci.
Technol. 32(5), 168-173 (2011)

12. Dauman, R.: Bone conduction: an explanation for this phenomenon comprising complex
mechanisms. Eur. Ann. Otorhinolaryngol. Head Neck Dis. 130(4), 209-213 (2013)

13. DiDomenico Astin, A.: Finger force capability: measurement and prediction using
anthropometric and myoelectric measures. Master’s thesis, Virginia Polytechnic Institute and State
University, Blacksburg, VA, USA (1999)

14. Dinse, H.R., Kalisch, T., Ragert, P., Pleger, B., Schwenkreis, P., Tegenthoff, M.: Improving
human haptic performance in normal and impaired human populations through unattended
activation-based learning. ACM Trans. Appl. Percept. 2(2), 71-88 (2005)

15. Ekenvall, L., Gemne, G., Tegner, R.: Correspondence between neurological symptoms and
outcome of quantitative sensory testing in the hand-arm vibration syndrome. Br. J. Ind. Med.
46, 570-574 (1989)

16. Fontana, F., Avanzini, F., Järveläinen, H., Papetti, S., Klauer, G., Malavolta, L.: Rendering and
subjective evaluation of real vs. synthetic vibrotactile cues on a digital piano keyboard. In:
Proceedings of the Sound and Music Computing conference (SMC), Maynooth, Ireland, pp.
161-167 (2015)

17. Fontana, F., Avanzini, F., Järveläinen, H., Papetti, S., Zanini, F., Zanini, V.: Perception of
interactive vibrotactile cues on the acoustic grand and upright piano. In: Proceedings of the
Sound and Music Computing Conference (SMC), Athens, Greece (2014)

18. Fontana, F., Papetti, S., Civolani, M., del Bello, V., Bank, B.: An exploration on the influence
of vibrotactile cues during digital piano playing. In: Proceedings of the Sound and Music
Computing conference (SMC), Padua, Italy (2011)

19. Fontana, F., Papetti, S., Järveläinen, H., Avanzini, F.: Detection of keyboard vibrations and
effects on perceived piano quality. J. Acoust. Soc. Am. 142(5), 2953-2967 (2017)

20. Galembo, A., Askenfelt, A.: Quality assessment of musical instruments - Effects of
multimodality. In: Proceedings of the 5th Triennial Conference of the European Society for the
Cognitive Sciences of Music (ESCOM5), Hannover, Germany (2003)

21. Gescheider, G., Bolanowski, S., Verrillo, R.: Some characteristics of tactile channels. Behav.
Brain Res. 148(1-2), 35-40 (2004)

22. Gescheider, G.A., Bolanowski, S.J., Verrillo, R.T., Arpajian, D.J., Ryan, T.F.: Vibrotactile
intensity discrimination measured by three methods. J. Acoust. Soc. Am. 87(1), 330 (1990)

23. Gescheider, G.A., Joelson, J.M.: Vibrotactile temporal summation for threshold and
suprathreshold levels of stimulation. Percept. Psychophys. 33(2), 156-162 (1983)

24. Gibson, J.J.: Observations on active touch. Psychol. Rev. 69, 477-491 (1962)

25. Giordano, B.L., Avanzini, F., Wanderley, M.M., McAdams, S.: Multisensory integration in
percussion performance. In: S.F. d’Acoustique SFA (ed.) 10ème Congrès Français d’Acoustique,
Lyon, France (2010)

26. Green, D., Swets, J.: Signal Detection Theory and Psychophysics. Wiley, New York (1966)

27. Guizzo, E.: Keyboard maestro. IEEE Spectr. 47(2), 32-33 (2010)

28. Harada, N., Griffin, M.J.: Factors influencing vibration sense thresholds used to assess
occupational exposures to hand transmitted vibration. Br. J. Ind. Med. 48(48), 185-192 (1991)

29. Harazin, B., Kuprowski, J., Stolorz, G.: Repeatability of vibrotactile perception thresholds
obtained with two different measuring systems. Int. J. Occup. Med. Environ. Health 16(4),
311-319 (2003)

30. Kayser, C., Petkov, C.I., Augath, M., Logothetis, N.K.: Integration of touch and sound in
auditory cortex. Neuron 48(2), 373-384 (2005)

31. Keane, M.: Separation of piano keyboard vibrations into tonal and broadband components.
Appl. Acoust. 68(10), 1104-1117 (2007)

32. Keane, M., Dodd, G.: Subjective assessment of upright piano key vibrations. Acta Acust. united
Ac. 97(4), 708-713 (2011)

33. Kinoshita, H., Furuya, S., Aoki, T., Altenmüller, E.: Loudness control in pianists as exemplified
in keystroke force measurements on different touches. J. Acoust. Soc. Am. 121(5), 2959-2969
(2007)

34. Lamoré, P.J., Keemink, C.J.: Evidence for different types of mechanoreceptors from
measurements of the psychophysical threshold for vibrations under different stimulation conditions. J.
Acoust. Soc. Am. 83(6), 2339-2351 (1988)

35. Levitt, H.: Transformed up-down methods in psychoacoustics. J. Acoust. Soc. Am. 49(2),
467-477 (1971)

36. Maeda, S., Griffin, M.J.: A comparison of vibrotactile thresholds on the finger obtained with
different equipment. Ergonomics 37(8), 1391-1406 (1994)

37. Makous, J., Friedman, R., Vierck, C.: A critical band filter in touch. J. Neurosci. 15(4),
2808-2818 (1995)

38. Marshall, K., Genter, B.: The musician and the vibrational behavior of a violin. J. Catgut
Acoust. Soc. 45, 28-33 (1986)

39. Merchel, S., Leppin, A., Altinsoy, E.: Hearing with your body: the influence of whole-body
vibrations on loudness perception. In: Proceedings of the 16th International Congress on Sound
and Vibration (ICSV), Kraków, Poland (2009)

40. Morey, R.D.: Confidence intervals from normalized data: a correction to Cousineau (2005).
Tutor. Quant. Methods Psychol. 4(2), 61-64 (2008)

41. Morioka, M., Griffin, M.J.: Dependence of vibrotactile thresholds on the psychophysical
measurement method. Int. Arch. Occup. Environ. Health 75(1-2), 78-84 (2002)

42. Okazaki, R., Kajimoto, H., Hayward, V.: Vibrotactile stimulation can affect auditory loudness:
a pilot study. In: Proceedings of the Eurohaptics Conference, Tampere, Finland, pp. 103-108
(2012)

43. Papetti, S., Järveläinen, H., Giordano, B.L., Schiesser, S., Fröhlich, M.: Vibrotactile sensitivity
in active touch: effect of pressing force. IEEE Trans. Haptics 10(1), 113-122 (2017)

44. Papetti, S., Järveläinen, H., Schmid, G.M.: Vibrotactile sensitivity in active finger pressing. In:
Proceedings of the IEEE World Haptics, Evanston, Illinois, USA (2015)

45. Ragert, P., Schmidt, A., Altenmüller, E., Dinse, H.R.: Superior tactile performance and learning
in professional pianists: evidence for meta-plasticity in musicians. Eur. J. Neurosci. 19(2),
473-478 (2004)

46. Ro, T., Hsu, J., Yasar, N.E., Elmore, L.C., Beauchamp, M.S.: Sound enhances touch perception.
Exp. Brain Res. 195(1), 135-143 (2009)

47. Russo, F., Ammirante, P., Fels, D.: Vibrotactile discrimination of musical timbre. J. Exp.
Psychol. Hum. Percept. Perform. 38(4), 822-826 (2012)

48. Saitis, C., Fritz, C., Scavone, G.P., Guastavino, C., Dubois, D.: Perceptual evaluation of violins:
A psycholinguistic analysis of preference verbal descriptions by experienced musicians. J.
Acoust. Soc. Am. 141(4), 2746-2757 (2017)

49. Shao, Y., Hayward, V., Visell, Y.: Spatial patterns of cutaneous vibration during whole-hand
haptic interactions. Proc. Natl. Acad. Sci. U.S.A 113(15), 4188-4193 (2016)

50. Soto-Faraco, S., Deco, G.: Multisensory contributions to the perception of vibrotactile events.
Behav. Brain Res. 196(2), 145-154 (2009)

51. Suzuki, H.: Vibration and sound radiation of a piano soundboard. J. Acoust. Soc. Am. 80(6),
1573-1582 (1986)

52. Van Doorn, G.H., Dubaj, V., Wuillemin, D.B., Richardson, B.L., Symmons, M.A.: Cognitive
load can explain differences in active and passive Touch. In: Isokoski, P., Springare, J. (eds.)
Haptics Perception, Devices, Mobility, Commun., Lecture Notes in Computer Science, vol.
7282, pp. 91-102. Springer, Berlin, Heidelberg (2012)

53. Verrillo, R.T.: Temporal summation in vibrotactile sensitivity. J. Acoust. Soc. Am. 37, 843-846
(1965)

54. Verrillo, R.T.: Psychophysics of vibrotactile stimulation. J. Acoust. Soc. Am. 77(1), 225-232
(1985)

55. Verrillo, R.T.: Vibration sensation in humans. Music Percept. 9(3), 281-302 (1992)

56. Verrillo, R.T., Gescheider, G.A.: Enhancement and summation in the perception of two
successive vibrotactile stimuli. Percept. Psychophys. 18(2), 128-136 (1975)

57. Wilson, E.C., Braida, L.D., Reed, C.M.: Perceptual interactions in the loudness of combined
auditory and vibrotactile stimuli. J. Acoust. Soc. Am. 127(5), 3038-3043 (2010)

58. Wilson, E.C., Reed, C.M., Braida, L.D.: Integration of auditory and vibrotactile stimuli: effects
of phase and stimulus-onset asynchrony. J. Acoust. Soc. Am. 126(4), 1960-1974 (2009)

59. Wollman, I., Fritz, C., Poitevineau, J.: Influence of vibrotactile feedback on some perceptual
features of violins. J. Acoust. Soc. Am. 136(2), 910-921 (2014)

60. Woodhouse, J.: The acoustics of the violin: a review. Rep. Prog. Phys. 77(11), 115901 (2014)

61. Wyse, L., Nanayakkara, S., Seekings, P., Ong, S.H., Taylor, E.A.: Palm-area sensitivity to
vibrotactile stimuli above 1 kHz. In: Proceedings of the Conference on New Interfaces for
Musical Expression (NIME), Ann Arbor, MI, USA (2012)

62. Yau, J.M., Olenczak, J.B., Dammann, J.F., Bensmaia, S.J.: Temporal frequency channels are
linked across audition and touch. Curr. Biol. 19(7), 561-566 (2009)

63. Yildiz, M.Z., Toker, I., Ozkan, F.B., Güçlü, B.: Effects of passive and active movement on
vibrotactile detection thresholds of the Pacinian channel and forward masking. Somatosens.
Mot. Res. 32(4), 262-272 (2015)

64. Young, G.W., Murphy, D., Weeter, J.: Auditory discrimination of pure and complex waveforms
combined with vibrotactile feedback. In: Proceedings of the Conference on New Interfaces for
Musical Expression (NIME), Baton Rouge, LA, USA (2015)

65. Zamorano, A.M., Riquelme, I., Kleber, B., Altenmüller, E., Hatem, S.M., Montoya, P.: Pain
sensitivity and tactile spatial acuity are altered in healthy musicians as in chronic pain patients.
Front. Hum. Neurosci. 8, 1016 (2014)


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, 
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate 
credit to the original author(s) and the source, provide a link to the Creative Commons license and 
indicate if changes were made. 

The images or other third party material in this chapter are included in the chapter’s Creative 
Commons license, unless indicated otherwise in a credit line to the material. If material is not 
included in the chapter’s Creative Commons license and your intended use is not permitted by 
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from 
the copyright holder. 


Chapter 5
The Role of Haptic Cues in Musical
Instrument Quality Perception


Charalampos Saitis, Hanna Järveläinen and Claudia Fritz 


Abstract We draw from recent research in violin quality evaluation and piano per- 
formance to examine whether the vibrotactile sensation felt when playing a musical 
instrument can have a perceptual effect on its judged quality from the perspective 
of the musician. Because of their respective sound production mechanisms, the vio- 
lin and the piano offer unique example cases and diverse scenarios to study tactile 
aspects of musical interaction. Both violinists and pianists experience rich haptic 
feedback, but the former experience vibrations at more bodily parts than the latter. 
We observe that the vibrotactile component of the haptic feedback during playing, 
both for the violin and the piano, provides an important part of the integrated sensory 
information that the musician experiences when interacting with the instrument. In 
particular, the most recent studies illustrate that vibrations felt at the fingertips (left 
hand only for the violinist) can lead to an increase in perceived sound loudness and 
richness, suggesting the potential for more research in this direction. 


5.1 Introduction 


Practicing a musical instrument is a rich multisensory experience. As explained in 
Chap. 2, the instrument and player form a complex system of sensory-motor inter- 
actions where the sensory feedback provided by the instrument as a response to a 
playing action (bowing, plucking, striking, blowing, pumping, rubbing, fingering) is
shaped not only by listening to the sound produced by that action, but also by feeling
the cutaneous vibrations (vibrotactile sensation) and reactive forces (proprioceptive 
sensation) resulting from the same action. In assessing the heard sound in terms of 
technical execution and expressive intention—pitch, timing, articulation, dynamics, 
timbre—the musician integrates additional haptic cues before the next sound is made 
in order to adjust their playing technique. In this sense, the perception and evaluation 
of the quality of a musical instrument, as seen from the perspective of the performer, 
are a rich multisensory experience as well. 

The proprioceptive component of the haptic feedback at a musical instrument 
is connected to the behavior of the instrument’s (re)action. An instrument with a 
precise and responsive action allows a skilled musician to produce a wide variety of 
timbre nuances through fine-grained control of synchrony, dynamics, attack speed, 
articulation, and balance in polyphonic texture. Vibrotactile feedback, on the other 
hand, consists essentially of the same oscillations that the instrument body radiates 
as sound [42, 49, 69-71] and is perceived simultaneously with the auditory signal, 
but differently [4, 6, 18, 25, 31, 41, 45, 62, 65]. In contrast to hearing, where
maximal sensitivity is in the range of 3000–4000 Hz, vibrotaction is most sensitive
in the vicinity of 250 Hz (see Sect. 4.2), which is within the range of most orchestral
instruments. The sensation of vibrations is already lost at about 1000 Hz, whereas
the range of most instruments extends well beyond this frequency. Tactile waveforms
of varying type and complexity can be discriminated [1, 8, 51, 59, 72] and can activate
areas of the auditory cortex in the absence of sound input [14]. Auditory and tactile
frequency is likely calculated in an integrated fashion during preattentive
sensory-perceptual processing, much earlier in the information processing chain than had
been supposed [13]. An overview of further comparisons between the auditory and
tactile modalities is given in Sect. 12.2. But is the vibrotactile sensation at a musical 
instrument perceptually relevant to its judged quality? 

In the first part of this chapter, we will review recent research on the perceptual 
evaluation of violin quality from the perspective of the musician. Haptic feedback 
is particularly relevant in playing an instrument such as the violin where physical 
contact with the performer is highly intimate compared to other instruments due to 
the violin’s sound making mechanism. The fingers, chin, and shoulder of the violinist 
are in immediate contact with the vibrating parts of the instrument, implying a rich 
source of haptic feedback, an understanding of which should help to reveal particular 
aspects of quality perception. We will initially discuss psycholinguistic evidence of 
how violin quality is conceptualized in the mind of the violinist during playing-based 
preference tasks and then describe a series of studies on the perception and quality 
evaluation effects of vibrotactile feedback at the left hand of the violinist in normal 
playing scenarios. 

Alongside the violin, we have chosen the piano as a second example case. Here, the 
contact between the performer and the instrument is much less intimate compared 
to the violin. Traditional piano playing involves touching only the keys (modern 
piano repertoire may sometimes require hitting or plucking the strings) and pedals 
(mediated by shoes). The nature and origin of piano touch have long been a source 
of fundamental disagreement in music performance and perception research: Are the
timbre and loudness of a single note determined solely by the velocity of the hammer,
or can the pianist further control them through the type of touch? In the second part of 
this chapter, we will then review recent literature on haptic feedback when playing the 
piano, examining the relationship between touch and tone quality, and more generally 
the importance of vibrotactile feedback to the perceptual evaluation of piano quality 
by the performer. 


5.2 Violin 


The violin as we know it today was developed in the early sixteenth century around 
Cremona in Italy and can be seen as the result of applying the tuning of the medieval 
rebec (fifths) to the body of the lira da braccio [16]. The transition from baroque to 
classical music led to a few further modifications in the second half of the eighteenth 
century, such as a longer, narrower fingerboard and neck. Since then, the basic violin
lutherie has remained largely unchanged, combining visual charm with ergonomics 
and a precise acoustical function. 

Sound is produced by bowing (or plucking) one or more strings at a location 
between the bridge and the edge of the fingerboard. The played string produces oscil- 
lations that are not efficiently radiated by the string itself due to its much smaller 
diameter than the acoustic wavelength of most audible frequencies [23]. Instead, 
the forces exerted from the vibrating string on the bridge cause the violin body to 
vibrate and thus radiate sound. The varying patterns in which different harmon- 
ics are transformed by the vibrating modes (resonances) of the body thus “color” 
the radiated sound. Figure 5.1 depicts a typical violin frequency response function 
(defined as the input admittance measured at the E-string notch on the bridge).
Furthermore, violin body resonances exhibit a slow decay that brings a “ringing” quality
to the sound [37]. At frequencies above about 1 kHz, the motions of the body create
frequency-dependent directivity formations that add “flashing brilliance” to its
sound [64].


5.2.1 Touch and the Conceptualization of Violin Quality 
by Musicians 


Attempts to quantify the characteristics of “good” and “bad” violins from vibrational
measurements such as the input admittance (Fig. 5.1) and/or listening tests have
largely been inconclusive (see [52] for a review). On the one hand, this may be due
in part to overly broad characterizations of “good” and “bad.” On the other hand,
both approaches end up considering the instrument in isolation from the musician, so no
haptic information is provided. Woodhouse was among the first to consider that what
distinguishes one violin from another lies not only in its perceived sound quality but
also in what he termed its playability, as in how the violinist “feels” the instrument
and how easy it is to produce a good sound [68]. To this end, recent research on
violin acoustics and quality has focused attention on the perceptual and cognitive
processes involved when violinists assess violins under normal playing scenarios.

Fig. 5.1 Input admittance of a violin obtained by exciting the G-string corner of the bridge with
a miniature force hammer and measuring the velocity at the E-string corner of the bridge with a
laser Doppler vibrometer [52]. The magnitude and phase are shown in the top and bottom plots,
respectively. Some of the so-called signature modes (i.e., strongly radiating and thus crucial to violin
sound) can be observed in the open string region, below about 600 Hz: the Helmholtz-type cavity
mode A0 at around 280 Hz and the first strongly radiating corpus bending mode B1+ just above
500 Hz. Also important is the hill-like collection of peaks known as the “BH peak” (bridge and/or
body hill) in the vicinity of 2-2.5 kHz, which allows a solo violin to be heard over an ensemble of
instruments

Fritz and colleagues carried out a series of listening tests using virtual vio- 
lins, whereby synthesized bridge-force signals were convolved with a digital filter 
mimicking the input admittance of the violin [29]. The measured admittance of 
a “good-quality” modern violin was first decomposed into its modal components, 
the parameters of which were then used to re-synthesize it, allowing for controlled 
variations of vibrato and body damping. Results showed that when listening to sin- 
gle notes, violinists found it difficult to assess the “liveliness” of the sound, and 
often, the word itself was not used in a consistent way across individuals. But when 
asked to play on an electric violin, whereby the actual bridge-force signal was passed 
through modified re-synthesized admittances in real time, musicians were able to rate 
liveliness consistently within and between individuals. This seems to suggest that 
liveliness is processed differently in passive listening versus active playing contexts, 
where haptic cues from proprioceptive and vibrotactile feedback are present. 
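
The modal re-synthesis idea described above can be illustrated with a minimal sketch. It assumes the textbook representation of an admittance as a sum of single-degree-of-freedom mobility terms; the source does not give the exact filter used in [29], and the mode frequencies, damping ratios, and amplitudes below are hypothetical values chosen only to sit near the A0 and B1+ regions of Fig. 5.1.

```python
import math

def modal_admittance(freq_hz, modes):
    """Complex admittance (velocity/force) as a sum of modal terms.

    modes: list of (f_k in Hz, damping ratio zeta_k, modal amplitude a_k).
    This is the standard modal-sum form, used here as an assumption.
    """
    w = 2 * math.pi * freq_hz
    total = 0j
    for f_k, zeta_k, a_k in modes:
        w_k = 2 * math.pi * f_k
        total += 1j * w * a_k / (w_k**2 - w**2 + 2j * zeta_k * w_k * w)
    return total

def scaled_damping(modes, factor):
    """Return a copy of the mode list with every damping ratio multiplied by factor."""
    return [(f_k, zeta_k * factor, a_k) for f_k, zeta_k, a_k in modes]

# Hypothetical two-mode "violin body" (illustrative values only).
modes = [(280.0, 0.01, 1.0), (510.0, 0.01, 1.0)]

peak = abs(modal_admittance(280.0, modes))
peak_damped = abs(modal_admittance(280.0, scaled_damping(modes, 2.0)))
print(f"peak at 280 Hz: {peak:.5f}, with doubled damping: {peak_damped:.5f}")
```

In this sketch, doubling the damping roughly halves the resonance peak, which is the kind of controlled variation of body damping that a re-synthesized admittance makes possible.
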

In another study, preference judgments made by three violin players during a 
listening and a playing test were compared in conjunction with psycholinguistic 
analyses of free-format verbal descriptions of musician experience provided by the 
three violinists [28]. The authors used a method from cognitive linguistics that relies 
on theoretical assumptions about cognitive-semantic categories and how they relate 
to natural language [20]. Categories can be thought of as collective representations 
and knowledge, to which individual assessments are conveyed by means of a shared
discourse. From what is being said (content analysis) and how it is being said (linguistic
analysis), relevant inferences about how people process and conceptualize sensory
experiences can be derived (semantic level) and further correlated with physical
parameters (perceptual level). This approach has been applied to other instruments
such as the piano [11] and the guitar [50], providing novel insights into how musicians
perceive instrumental sound as well as playing characteristics. Fritz and colleagues
found that the overall evaluation of a violin, as reflected in the verbal responses of the
musicians, varied between listening and playing conditions, with the latter invoking
linguistic expressions influenced not only by the produced sound but also by the
physical interaction between the performer and the instrument.

Saitis and colleagues carried out two violin playing perceptual tests based on a 
carefully controlled protocol [56, 57]. Emphasis was given to the design of conditions 
that are musically meaningful to the performer (e.g., playing versus listening,
comparing different instruments as in a violin workshop, using own bow, allowing time
to become familiar with the different violins, developing own strategy). In the first
experiment, skilled violinists ranked a set of different violins from least to most preferred.
In the second experiment, another group of players rated a different set of violins 
according to specific attributes as well as preference. In both experiments, musicians 
were asked to verbally describe their choices through open-ended questions. Anal- 
yses of intra-individual consistency and inter-player agreement in the (nonverbal) 
preference and attribute judgments showed that while violinists generally agreed on 
what particular attributes they look for in an instrument, the perceptual evaluation of 
the same attributes varied dramatically across individuals, thus resulting in large inter- 
player differences in the preference for violins. A third experiment [58] and studies 
by Fritz et al. [26, 27] and Wollman et al. [66, 67] reached similar conclusions. 
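
The distinction between intra-individual consistency and inter-player agreement can be made concrete with a small sketch. It assumes Spearman rank correlation as the measure, which is a common choice but is not stated in the text as the statistic used in [56, 57]; all rankings below are hypothetical.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman correlation for two rankings of the same items (no ties)."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n**2 - 1))

# Hypothetical rankings of five violins (1 = least preferred, 5 = most preferred).
player1_trial1 = [1, 2, 3, 4, 5]
player1_trial2 = [1, 3, 2, 4, 5]   # same player, repeated trial
player2_trial1 = [5, 4, 3, 2, 1]   # a second player with the opposite preference

consistency = spearman_rho(player1_trial1, player1_trial2)  # within one player
agreement = spearman_rho(player1_trial1, player2_trial1)    # between players
print(consistency, agreement)
```

A pattern like this one, with high correlation within a player and low or negative correlation between players, corresponds to the finding that violinists are self-consistent yet disagree with each other.
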

To better understand the perceptual and cognitive processes involved when vio- 
linists evaluate violins, Saitis and colleagues further analyzed the verbal expressions 
collected in their two violin playing tests [53-55], expanding on an earlier work of 
Fritz et al. [28]. Based on psycholinguistic inferences, it was argued that violin play- 
ers of varying style and expertise share a common framework for conceptualizing 
violin quality on the basis of semantic features and psychological effects that inte- 
grate perceptual attributes (i.e., perceptual correlates of physical characteristics) of 
not only the sound produced but also the vibrotactile and proprioceptive sensations 
experienced when playing the instrument (Fig. 5.2). The bowed string and vibrating 
body system contribute to the perception of sound quality through (a) the amount of 
felt vibrations in the left hand, shoulder, and chin (conceptualized as resonance); (b) 
through assessing the offset (speed) and amount (ease) of reactive force (conceptu- 
alized as response) from the body in the right hand (through the bow) with respect to 
the quality and intensity of the heard as well as felt vibrations; and (c) through com- 
paring these between different notes across the instrument’s register (conceptualized 
as balance across strings). 

These psycholinguistic investigations provide empirical evidence that vibrations 
from the violin body and the bowed string (via the bow) are used by violinists as 
extra-auditory cues that not only help better control the played sound [4], but also 
contribute to a crossmodal audio-tactile assessment of its attributes. The perception of
violin sound quality is thus elaborated both from sensations linked to auditory
information and from haptic factors associated with proprioceptive and vibrotactile cues.
The cognitive model shown in Fig. 5.2 raises interesting questions concerning the
characterization of haptic feedback in violin playing quality tests: what to measure
and how? Can standard vibrational measurements, such as a violin’s bridge admittance
(Fig. 5.1), capture everything significant about the reactive force and vibration
levels felt by the player? If yes, in what ways can this information be extracted?

Fig. 5.2 From body vibrations to semantic categories: a cognitive model describing how the
perception of violin quality is elaborated on the basis of both auditory and haptic cues [55]


5.2.2 Vibrotactile Feedback at the Left Hand 


Acoustics and psychophysics literature on the “feel” of a violin has been limited 
compared to the ample amount of research on the instrument’s sound. Marshall 
suggested that violin neck vibrations felt through the left hand form the basis for the 
perception of how a violin feels [43, 44]. He argued that the more often the left hand 
detects motions at antinodal parts of the neck (which are typically damped when the 
musician holds the violin but can be sensed directly on the skin), the more “alive” 
the violin will be felt. Askenfelt and Jansson showed that vibrations perpendicular 
to the side of the neck, measured on four violins of varying quality during playing 
a single note (lowest G, 196 Hz), were above or very close to vibration sensation 
thresholds measured at the fingertip under passive touch conditions by Verrillo [61]. 
However, no evidence was found that higher neck vibration intensity would result in 
judging a violin as being of better quality [4]. One limitation of that study was that
vibration amplitude was measured for five frequencies only, corresponding to the
first five harmonics of the played note and thus lying below the 1 kHz upper limit of
the human skin sensitivity range. Another potential issue (discussed in Sect. 4.3.4
for the piano) is that Verrillo’s thresholds may not fully reflect actual vibration
detection offsets when the left hand holds the neck of the violin (e.g., differences in
location and size of contact area, pressure exerted from the hand on the neck).

Fig. 5.3 Horizontal vibration levels at the side of the necks of violins (first position) perceived as
either a “vibrating” or b “non-vibrating” (solid lines) and vibration sensation threshold at the left
hand of violinists (dashed line). Reproduced from [65] with permission from S. Hirzel Verlag

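
The remark that the first five harmonics of the lowest G lie below the 1 kHz limit of skin sensitivity can be checked with simple arithmetic. The sketch below treats 1000 Hz as a hard cutoff, which is a simplification of a gradual loss of sensitivity, and uses standard equal-tempered open-string frequencies (A4 = 440 Hz) that are not taken from the text.

```python
TACTILE_LIMIT_HZ = 1000.0  # approximate upper limit of vibrotactile sensitivity

def harmonics_below(f0_hz, limit_hz):
    """Return the harmonic frequencies n * f0 that lie below limit_hz."""
    out = []
    n = 1
    while n * f0_hz < limit_hz:
        out.append(n * f0_hz)
        n += 1
    return out

# Lowest G of the violin, as played in the Askenfelt and Jansson measurement.
g3 = harmonics_below(196.0, TACTILE_LIMIT_HZ)  # 196, 392, 588, 784, 980 Hz

# Standard open-string frequencies (assumed tuning, A4 = 440 Hz).
OPEN_STRINGS_HZ = {"G3": 196.0, "D4": 293.66, "A4": 440.0, "E5": 659.26}
counts = {name: len(harmonics_below(f0, TACTILE_LIMIT_HZ))
          for name, f0 in OPEN_STRINGS_HZ.items()}
print(g3, counts)
```

Under these assumptions the number of potentially perceivable harmonics drops quickly with pitch, from five on the open G string to a single one on the open E string.
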
Wollman and colleagues were the first to systematically address the role of haptic 
cues from neck vibrations on violin quality perception. Expanding on the work of 
Askenfelt and Jansson [4], vibration levels were measured at the violin neck in first 
position¹ across a set of ten instruments, which were characterized by a professional
violinist according to how “vibrating” they were felt to be [65]. Neck vibration fre- 
quency response curves of “vibrating” and “non-vibrating” violins, obtained across 
the whole range of the instrument through laser vibrometry, were then compared 
to absolute vibrotactile thresholds measured on fourteen violinists holding in first 
position a real isolated violin neck vibrating at six frequencies between 196 and 
800 Hz (the first four were chosen to correspond to the open strings). This setup 
helped obtain violin playing-specific thresholds (i.e., measured under active touch 
conditions, similar to what was done in Sect. 4.3 for the piano) that are more appro- 
priate to compare with vibration levels than those measured by Verrillo [61] and used 
by Askenfelt and Jansson. It was observed that while neck vibrations of “vibrating” 
violins were well above the detection threshold by an average of 15 dB in the range 
200-800 Hz, those of “non-vibrating” violins exhibited a steep attenuation of about 
40 dB around 600 Hz and stayed below or close to the threshold above that (Fig. 5.3). 
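
To give a feel for the decibel figures quoted above, the following sketch converts level differences into linear ratios. It assumes the levels are amplitude-type quantities (such as velocity or acceleration), so that the 20 log10 convention applies; the example vibration and threshold levels are hypothetical and are not values read from Fig. 5.3.

```python
def sensation_level_db(vibration_db, threshold_db):
    """Level of a vibration relative to the detection threshold, in dB."""
    return vibration_db - threshold_db

def amplitude_ratio(level_db):
    """Linear amplitude ratio for a level difference in dB (20*log10 convention)."""
    return 10 ** (level_db / 20)

# Hypothetical example: a neck vibration 15 dB above threshold.
margin = sensation_level_db(vibration_db=-20.0, threshold_db=-35.0)
print(margin, amplitude_ratio(margin))   # 15 dB is roughly a factor of 5.6
print(amplitude_ratio(40.0))             # a 40 dB attenuation is a factor of 100
```

Seen this way, a 40 dB drop means the vibration amplitude falls to about one hundredth, which makes it plausible that the necks of the "non-vibrating" violins end up at or below threshold.
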
In another study [66], fifteen professional musicians listened to three violins while
sitting on a chair and holding a real isolated violin neck on which they fingered the
performed score. The instruments were being played live by another violinist
(non-participant) in the same room, placed behind a curtain in front of the participants.
Along with the live sound, vibrations of the played violins were picked up at the
scroll using a small accelerometer and then transmitted through a shaker system to
the isolated neck (Fig. 5.4). They were presented either at the same level as in the
played violin, reduced by half, or fully attenuated. This condition was described
by the authors as active listening. Participants were asked to rate the violins on
richness of sound, loudness, responsiveness, and pleasure of playing. It was observed
that violinists judged all three violins as having a less loud but also a less rich
sound whenever the level of vibrations felt on the isolated neck was reduced by
half (Fig. 5.5). These results complemented the findings of Yau and colleagues, who
have shown that in a non-musical context, the simultaneous presentation of tactile
distractors can cause an increase in perceived tone loudness [71].

¹ “Position” refers to where the left hand is placed on the string. In the first position, the index
finger presses the string at the scroll end of the fingerboard, which produces the next note (full
tone) up from the open string (e.g., on the G string, first position corresponds to A).

In a third experiment [67], twenty violinists evaluated five violins under three
sensory masking conditions: playing without hearing the produced sound, playing
without feeling the produced vibrations, and playing normally (i.e., neither modality
was masked). Auditory feedback was masked by means of earmuffs and in-ear
monitors playing white noise with a bandwidth of 20–20000 Hz, while passive
anti-vibration material was added to the chin rest to minimize bone conduction. Vibrations
were primarily masked on the left hand using vibrating rings worn on the thumb,
index, and ring fingers, while vibrations through the chin and shoulder rests were
attenuated as in the auditory masking scenario. In each condition, musicians first
rated each violin on a number of criteria related to perceived sound and playing
characteristics and then commented on how relevant those criteria were each time. These
data provided further evidence that the perceptual evaluation of violin attributes such
as liveliness, power, evenness across the strings, or dynamic range relies not only on
sonic information but also on vibrotactile cues. Concerning overall preferences, it was
observed that removing auditory feedback was not more disruptive than attenuating
felt vibrations, although its effect tended to depend on the instrument (Fig. 5.6).

Fig. 5.4 Experimental setup for transmitting vibrations from the neck of a played violin to an
isolated neck [66]. Reproduced with the permission of the Acoustical Society of America

These studies indicate that the violin neck vibrations felt by the violinist through the
left hand can serve as an important cue to the concept of “feel” in violin quality
evaluation, as well as augment the perception of qualities attributed to the sound (in
this case “loud” and “rich”). They also introduce novel methods for characterizing
vibrotactile feedback at the left hand. Another source of haptic cues that potentially 
relate to perceived “feel” and sound quality is the vibration of the chin rest. Askenfelt 
and Jansson argued that the jaw is less sensitive than the left hand, but it may still be 
possible for the violinist to sense these vibrations because of the larger contact area 
of the jaw with the chin rest [4]. Similarly to the violin neck, it would be interesting 
to investigate whether vibrotactile feedback at the chin contributes to the perception 
of a violin’s “feel” and/or sound.


5.3 Piano


The modern piano, descending from the harpsichord and introduced by Bartolomeo
Cristofori in 1709, evolved into two distinct types, the grand piano and the upright
piano. The latter was developed in the middle of the nineteenth century, and its
action differs somewhat from that of the former due to design constraints, although they
share the same sound production principle [23]: A piano string is set in vibration
when the respective key is depressed, a damper raised, and a felt hammer hits the
string (Fig. 5.7). String vibrations are transmitted through the bridge to the
soundboard, from which the sound radiates into the air. Modal structure of the soundboard
and material properties further contribute to the acoustics of the piano. The sound
is characterized by different decay rates between partials [21], a two-part pattern
of time decay (or double decay) due to double and triple unison strings [63], and
inharmonicity in terms of stretching of the partials due to string stiffness [22].

Fig. 5.7 Illustration of the function of the piano action at successive stages during a keystroke. a
Rest position: The hammer rests via the hammer roller on the repetition lever, a part of the lever
body. The lever body stands on the key, supported by the capstan screw. The weight of the hammer
and lever body holds the playing end of the key in its upper position. The damper is resting on the
string. b Acceleration: When the pianist’s finger depresses the key, the lever body is rotated upward.
The jack, mounted on the lever body, pushes on the roller and accelerates the hammer. The damper
is lifted off the string by the inner end of the key. c Let-off: The tail end of the jack is stopped by the
escapement dolly, and the top of the jack is rotated away from the hammer roller. The free hammer
continues toward the string. The repetition lever is stopped in waiting position by the drop screw. d
Check: The rebounding hammer falls with the hammer roller on the repetition lever in front of the
tripped jack. The hammer is captured at the hammer head by the check at the inner end of the key.
Reprinted from [3] with the permission of the Acoustical Society of America
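
The stretching of partials due to string stiffness, mentioned above, is commonly modeled with the stiff-string relation f_n = n f_0 sqrt(1 + B n^2), where f_0 is the fundamental of the ideal flexible string and B is the inharmonicity coefficient. The sketch below uses this standard relation; the value of B and the choice of note are hypothetical and only meant to show the order of magnitude of the effect.

```python
import math

def stiff_string_partial(n, f0_hz, b_coeff):
    """Frequency of the n-th partial of a stiff string: n*f0*sqrt(1 + B*n^2)."""
    return n * f0_hz * math.sqrt(1.0 + b_coeff * n * n)

def cents(freq_hz, ref_hz):
    """Interval between two frequencies in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(freq_hz / ref_hz)

F0_HZ = 261.63      # around middle C, chosen for illustration
B_COEFF = 0.0004    # hypothetical inharmonicity coefficient

for n in (1, 2, 5, 10):
    partial = stiff_string_partial(n, F0_HZ, B_COEFF)
    sharp_by = cents(partial, n * F0_HZ)
    print(f"partial {n:2d}: {partial:8.2f} Hz, {sharp_by:5.2f} cents above harmonic")
```

With these assumed values the tenth partial lies roughly a third of a semitone above the exact harmonic, while the lowest partials are almost unaffected.
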


5.3.1 Piano Touch and Tone Quality 


There is a long-standing discrepancy between the acoustical basis of how the timbre 
of a single piano tone is created and the practical experience of piano performers [3, 
5]. When considering only the mechanics of the hammer-string interaction, piano 
timbre would be an instrument-specific result of loudness, which in turn depends 
on the velocity at which the hammer hits the string, controlled only through key 
velocity produced by the finger pressing force of the player. The way of touching 
the key would therefore have no influence on the resulting timbre. Skilled pianists, 
on the other hand, aim to control timbre and loudness independently through touch 
and gestural means involving movements of the entire upper body. A review on 
the historical development of various schools on piano technique as well as recent 
performance analysis and biomechanical studies on piano touch is presented by 
MacRitchie [40]. 

There is some evidence in favor of the touch effect, although it seems to be weaker 
than many pianists believe and mostly caused by other aspects of the sound than the 
tonal component. Goebl and colleagues measured the ability of pianists to perceive 
differences in piano sound independently of intensity [35]. Half of the participants 
were able to correctly distinguish between struck and pressed touch in the presence 
of finger-key noises occurring 20-200 ms before the sound. When the noises were 
cut from the sound signals, performance dropped to chance level. Pianists were also 
able to distinguish piano sounds of equal hammer velocity with either present or 
absent key-keybed noises with an average of 82% accuracy [34]. Askenfelt observed 
that structure-borne transients, dependent on the type of touch and present 20-30 ms
before the first transversal wave on the string arrives at the bridge, may potentially 
be connected with the pianist’s touch [2]. More recently, numerical simulations of 
the hammer head-shank interaction showed a difference in spectral profile between 
legato and staccato sounds in the range of 500-1000 Hz [17]; however, an effect 
on perceived timbre was not shown experimentally. Suzuki reported a slight spectral 
brightening for G5, in the order of 1.5 dB at the tenth partial, as a result of “hard” 
or “soft” touch depending on the degree of stiffness of shoulder, elbow, wrist, and 
finger [60]. When listening only, about half of the participants could distinguish an 
effect of similar degree after training. 

To discover how pianists achieve fine-grained control of their instrument’s sound, 
the way they describe and recognize timbre nuances in piano performance has gained 
interest. Bernays and Traube quantified a semantic space of five descriptors (dry, 
bright, round, velvety, and dark) [10] based on an analysis of free verbalizations 
provided by pianists [7] and conducted a series of studies where pianists performed 
pieces highlighting each of the five semantic dimensions of piano timbre. Despite 
differences between musicians relating to individual playing styles, common timbre
nuance strategies were revealed across different performances [11, 12]. The latter
were saliently grouped by the intended timbre on a bidimensional space by means of
principal components analysis. The first component was found to be associated with 
dynamics, attack, and soft pedal features, while the second dimension was related 
to sustain pedal. Further playing style factors included key depression depth, legato 
versus staccato articulation, and balance between hands. 

Given the pianists’ common ways of nuance control, the question arises whether
listeners can differentiate and identify the resulting timbres in piano performance. 
To this end, Bernays reported a pilot study where listeners both described freely and 
identified in a forced choice task the timbre of piano performance excerpts, each 
intended to reflect one of the following timbre nuances: bright, dark, distant, full- 
bodied, harsh, matte, round, and shimmering [9]. Participants identified the timbre 
categories above chance level except for round and matte. Some categories, like 
bright and shimmering, were frequently mixed up, probably due to their semantic
proximity. 
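
What "above chance level" means in a forced-choice task of this kind can be sketched with a binomial tail probability. The sketch assumes that all eight timbre labels were offered on every trial, so that chance is 1/8, and it uses hypothetical trial counts; the statistical test actually used in [9] is not described in the text.

```python
from math import comb

def p_at_least(k, n, p):
    """Probability of k or more successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

CHANCE = 1 / 8       # eight response categories, assumed equally likely under guessing
N_TRIALS = 16        # hypothetical number of excerpts for one timbre category
N_CORRECT = 6        # hypothetical number of correct identifications

p_value = p_at_least(N_CORRECT, N_TRIALS, CHANCE)
print(f"expected correct by guessing: {N_TRIALS * CHANCE:.1f}")
print(f"P(at least {N_CORRECT} correct by guessing) = {p_value:.4f}")
```

Under these assumptions, six correct answers out of sixteen would be unlikely to arise from guessing alone, even though the raw hit rate is well below 50%.
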

These studies have revealed that pianists can control timbre independently of 
dynamics: The way of touching the keys produces differences in contact noises 
(finger-key, key-key bottom, and release sounds) as well as slight spectral effects. 
While these may be inaudible to the average listener, they have a stronger and more 
important effect on the skilled pianist due to sensory integration of the matching touch 
and sound information [15]. Especially in polyphonic touch, these subtle vibrotactile 
cues may enable the player to produce and control a wide range of timbre nuances. 


5.3.2 Haptic Cues and Instrument Quality 


Some early experiments on multimodal perception of piano quality were conducted 
by Galembo and Askenfelt [30], in which pianists evaluated four concert grand 
pianos under varying sensory feedback conditions. When freely playing the instru- 
ments, professional pianists ranked them as expected according to the manufacturers’ 
reputation. However, musicians failed to identify the pianos in a listening-only con- 
dition, nor was the resulting quality ranking equal to the playing-based evaluation. 
In a subsequent free playing task, where visual feedback was blocked by means of
blindfolding the musicians and auditory feedback was blocked through masking
noise, the pianists were actually able to identify the instruments without difficulty.
These experiments offer some evidence that pianos can be identified by their hap- 
tic response perhaps even better than by their sound. As an underlying mechanism, 
one should expect that different piano actions react differently to different dynamics 
and types of touch and that these differences are perceivable and possibly of more 
importance than auditory cues to the player. 

Askenfelt and Jansson had previously made timing measurements of the various 
parts of the piano action and observed differences mainly as a function of dynamics 
and regulation of the action (mechanical adjustments to compensate for the effects 
of wear) [3]. Goebl et al. [36] studied in detail the temporal behavior of three grand
piano actions. Touch-related differences were found through measurements of finger-key,
hammer-string, and key-keybed contact times and maximum hammer velocities
throughout the entire dynamic range for several keys. A different key velocity tra- 
jectory in struck and pressed sounds was also observed. Struck sounds showed two 
acceleration phases of key velocity, while the pressed sounds developed more lin- 
early. These differences between struck and pressed touch were observed in all three 
pianos that were measured. However, it remains unknown how the behavior of the 
piano action may affect the player experience. The authors of the study hypothesize 
that since the pianist needs to (unconsciously) estimate the path from touch to tone 
onset and intensity for various dynamics and types of touch, a high-quality instru- 
ment is one that has a precise and consistent action. In their own informal evaluation 
as pianists, the most highly appreciated instrument turned out to have the lowest 
compressibility of the parts of the action, short free-travel times of the hammer, and late
maxima in the hammer velocity trajectory. 


5.3.2.1 Vibrations in the Acoustic Piano 


Keane analyzed keyboard vibrations at four upright and four grand pianos by remov- 
ing harmonic peaks from the spectrum of the vibration signal and thus splitting it into 
tonal and broadband parts [38]. Similar tonal components were observed across the 
two piano types, but upright pianos showed a stronger broadband component, which 
could explain the generally lower perceived quality of upright versus grand pianos. 
In fact, a later study showed that pianists preferred the tone quality and loudness 
profile of an upright piano with attenuated broadband vibrations [39]. 

Fontana and colleagues investigated the effect of key vibrations on acoustic piano 
quality using both a grand and an upright Yamaha Disklavier, which can operate 
in both an acoustic and silent mode [25]. While playing, pianists received audi- 
tory feedback through a piano software synthesizer and tactile feedback through the 
Disklavier keyboard. The technical setup is described in more detail in Sect. 4.3.1. 
The experimental task involved comparing a non-vibrating to a vibrating piano setup 
during free playing according to several quality attributes. In the non-vibrating setup 
(A), the Disklavier was operating in silent mode, which prevents the hammers from 
hitting the strings and thus from producing vibrations. In the vibrating setup (B), 
the Disklavier was operating in acoustic mode, allowing the natural vibration of the 
strings to be transmitted to the soundboard as well as to the keys. However, the 
acoustically produced sound was blocked by insulating earmuffs placed on top of 
the earphones playing back the synthetic piano sound. Pianists rated the following 
attributes on a continuous scale ranging from −3 (“A much better than B”) to +3 (“B
much better than A”): dynamic range, loudness, richness, naturalness, and prefer- 
ence. All attributes except preference were rated separately in the low (keys below 
D3), mid (keys between D3 and A5), and high (keys above A5) range. 

For both the grand and the upright piano type, the vibrating setup was preferred
to the non-vibrating condition (Fig. 5.8). The mean preference scores were
1.05 (n = 15, SD = 1.48) for the upright piano and 0.77 (n = 10, SD = 1.71) for the grand
piano. The distributions of the preference ratings did not differ significantly between 
pianos. Interestingly, while the participants generally preferred when vibrations were 
present, in the subsequent debriefing only one of them could pinpoint vibration as the 
difference between the setups. There was considerable positive correlation between 
attribute scales and frequency ranges. Ratings correlated highly between the low 
and mid ranges (mean Pearson ρ = 0.58) and between the mid and high regions
(ρ = 0.43). At a later stage, a vibration detection sensitivity experiment conducted
using the same setup (see Sect. 4.3) showed that piano key vibrations are perceived 
roughly up to note A4 (440 Hz). As such, the high range was entirely beyond the sen- 
sitivity range. That said, the detection experiment was performed under controlled 
timing and single notes or three-note clusters in the high range, while a free playing 
task constitutes a more ecological setting (usually involving multifinger interaction). 
This may explain the slight effect of vibration on higher frequencies in the latter. For 
further analysis, new dependent variables were formed by taking the average over 
the low- and mid-frequency ranges. Partial correlation analysis and principal com- 
ponents analysis suggested that naturalness and richness of tone were the attributes 
most associated with increased preference. 

Inter-individual consistency was low in both piano groups, suggesting high dis- 
agreement between individuals. Specifically, five participants preferred the non- 
vibrating setup. When the negative preference rating was used as a criterion for a 
posteriori segmentation [48], the attitudes of the two groups segregated clearly. While 
the negative and positive groups gave rather similar ratings for dynamic range and 
loudness, their mean ratings for richness, naturalness, and preference were clearly
different (Fig. 5.9). The mean preference ratings were 1.58 (n = 20, SD = 0.79) and
−1.61 (n = 5, SD = 1.10) for the positive and negative groups, respectively. Thus,
while 80% of the participants associated dynamic range and loudness with natural- 
ness, richness, and preference, the remaining 20% had the opposite opinion. 
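
As a quick arithmetic check, the group means reported above can be recombined into an overall mean preference, once for the split by piano type and once for the a posteriori split by sign of preference. The sketch below is only an illustration: the function name and data layout are assumptions and are not part of the original analysis.

```python
def pooled_mean(groups):
    """Weighted mean of several groups given as (n, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n

# Values as reported in the text: (n, mean preference rating)
by_piano = [(15, 1.05), (10, 0.77)]      # upright, grand
by_attitude = [(20, 1.58), (5, -1.61)]   # positive group, negative group

overall_piano = pooled_mean(by_piano)
overall_attitude = pooled_mean(by_attitude)

# Share of participants in the negative group
share_negative = 5 / (20 + 5)
```

Both splits give an overall mean preference of about 0.94 on the −3 to +3 scale, so the two sets of reported figures agree to within rounding, and the negative group accounts for 20% of the 25 participants.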


5.3.2.2 Digital Piano Augmented with Vibrations 


A recent study on the effect of the nature of vibration feedback on perceived piano 
sound quality suggested that pianists may well be sensitive to the match between 
the auditory and the vibrotactile feedback [24]. The experimental setup (described 
in detail in Sect. 13.3.2) involved a digital keyboard enhanced both by realistic 
and synthetic key vibrations. Realistic vibrations were recorded from a Yamaha 
Disklavier grand piano. Synthetic vibration signals were generated using bandpass- 
filtered white noise, centered at the pitch and matching the amplitude envelope and 
energy of the recorded vibrations. They were interpolated according to key velocity 
and reproduced by transducers attached to the bottom of a digital piano. The reference 
setup consisted of auditory feedback only (A). The three test setups consisted of 
auditory feedback plus (B) recorded real vibrations, (C) recorded real vibrations 
with 9 dB boost, and (D) synthetic vibrations. Each of the test setups was compared 
to the reference setup in a free playing task, similar to that described above for
the acoustic piano. Ratings were given on dynamic control, richness, engagement, 
naturalness, and overall preference. 
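
The synthetic condition can be illustrated with a minimal sketch: white noise is band-pass filtered around the pitch, shaped by the amplitude envelope of a reference signal, and scaled to the same total energy. Everything specific here is an assumption made for illustration only: the second-order filter, the Q value, the block-wise RMS envelope, and the decaying sine that stands in for a recorded key vibration. The study's actual signal processing is not described at this level of detail in the text.

```python
import math
import random

def bandpass_biquad(x, fs, f0, q):
    """Second-order band-pass filter (common 'cookbook' form, 0 dB peak gain)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b1, b2 = alpha, 0.0, -alpha
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = (b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def rms_envelope(x, win):
    """Amplitude envelope as a block-wise RMS, held constant within each block."""
    env = []
    for start in range(0, len(x), win):
        block = x[start:start + win]
        level = math.sqrt(sum(v * v for v in block) / len(block))
        env.extend([level] * len(block))
    return env

def energy(x):
    """Total signal energy (sum of squared samples)."""
    return sum(v * v for v in x)

def synthetic_vibration(recorded, fs, pitch_hz, q=5.0, win=256, seed=0):
    """Noise centred at the pitch, matching envelope and energy of `recorded`."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in recorded]
    band = bandpass_biquad(noise, fs, pitch_hz, q)
    env = rms_envelope(recorded, win)
    shaped = [b * e for b, e in zip(band, env)]
    gain = math.sqrt(energy(recorded) / energy(shaped))
    return [gain * s for s in shaped]

# Stand-in for a recorded key vibration: a decaying 220 Hz sine (not real data).
fs = 8000
recorded = [math.exp(-3.0 * n / fs) * math.sin(2.0 * math.pi * 220.0 * n / fs)
            for n in range(fs)]
synthetic = synthetic_vibration(recorded, fs, pitch_hz=220.0)
```

The resulting signal has the same length and energy as the reference and decays in the same way, but carries no harmonic structure of its own.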

On average, participants preferred the vibrating setup in all categories except for 
naturalness in condition D (Fig. 5.10). The strongest preferences were for dynamic
control and engagement. Generally, condition C was the most preferred of the vibration
conditions: it scored highest on four of the five scales, although B was considered
the most natural. Interestingly, B scored lowest on all other scales. Similar to
the Disklavier experiment discussed in the previous section, participants could be
classified a posteriori into two groups, where median preference ratings for setup
C were +2.0 and −1.5 for each group, respectively. In the larger group of positive
preference (n = 8), nearly all attributes were rated positively versus only one in
the smaller, negative group (n = 3). Notably, although auditory feedback remained
unchanged, participants associated higher preference for the vibrating setup with richness
of tone, which, during preparation for the experiment, was explained to them as
a sound-related attribute. This supports the hypothesis that from the perspective of
the musician, the perception of instrument quality emerges through the integration of
both auditory and haptic information.


5.4 Conclusions 


The perceptual evaluation of musical instrument quality has traditionally been con- 
sidered a unisensory experience in the scientific and industrial world alike, based 
exclusively on how the produced tone sounds in terms of pitch, dynamics, articula- 
tion, and timbre. To a certain extent, this is naturally expected. After all, the objective 
of playing a musical instrument is to make (musical) sounds. But while this holds 
true for the non-musician listener, it only tells part of the story from the perspective 
of the musician, where aural impression is accompanied by haptic feedback due to
one or more bodily parts of the player physically touching vibrating components of 
the instrument. Well-established theories of sensory-motor multimodal interaction 
and auditory-tactile multisensory integration in the analytical and empirical study of 
music performance assert that haptic cues carry important information concerning 
the control of the (re)action of the instrument and thus its sound and that temporal 
frequency representations are perceptually linked across audition and touch. 

The violin and the piano offer unique example cases to examine whether the haptic 
interaction between the musician and the instrument can have a perceptual effect on 
quality evaluation. Both instruments require a significant amount of sensory-motor 
synergy to produce refined and precise sonic events, providing rich haptic feedback 
to the performer. At the same time, unlike pianists, violinists experience
vibrations at bodily parts other than the hands, which makes it difficult to measure
performance parameters and control vibrotactile feedback in normal playing exper- 
imental scenarios. The physical differences in the violin versus piano touch and the 
experimental freedoms or constraints imposed by them can help better understand the 
role of vibrotaction on the playing experience as well as the expressive possibilities 
it can afford in varying performance contexts. Particularly in the case of the piano, 
the MIDI protocol and the availability of computer-controlled keyboard instruments 
such as the Yamaha Disklavier and Bösendorfer CEUS offer fertile opportunities to
obtain detailed piano performance data under well controlled but musically mean- 
ingful experimental conditions, although with some limitations [33]. 

Our review has shown that the vibrotactile component of the haptic feedback 
during playing, both for the violin and the piano, provides an important part of 
the integrated sensory information that the musician experiences when interacting 
with the instrument. In particular, the most recent violin and piano studies provide 
evidence that vibrations felt at the fingertips (left hand only for the violinist) can lead 
to an increase in perceived sound loudness and richness, suggesting the potential 
for more research in this direction. Investigations of the type and role of musical 
haptic feedback have also been reported for other instruments (e.g., [19, 31, 32]) 
as well as singing [47]. A vast field of topics awaits investigation, starting from the
methods and aspects of instrument quality evaluation per se [15]. In which aspects 
does haptic feedback have a significant role? Which performance parameters (for 
example, timing accuracy) can be used to assess the haptic dimension in instrument 
quality perception?


Chapter 6
A Functional Analysis of Haptic Feedback in Digital Musical Instrument Interactions


Gareth W. Young, David Murphy and Jeffrey Weeter 


Abstract An experiment is presented that measured aspects of functionality, usabil- 
ity and user experience for four distinct types of device feedback. The goal was to 
analyse the role of haptic feedback in functional digital musical instrument (DMI) 
interactions. Quantitative and qualitative human-computer interaction analysis tech- 
niques were applied in the assessment of prototype DMIs that displayed unique ele- 
ments of haptic feedback; specifically, full haptic (constant-force and vibrotactile) 
feedback, constant-force only, vibrotactile only and no feedback. From the analysis, 
data are presented that comprehensively quantify the effects of feedback in haptic 
interactions with DMI devices. The investigation revealed that the various types of 
haptic feedback applied had no significant functional effect upon device performance 
in pitch selection tasks; however, a number of significant effects were found upon 
the users’ perception of usability and their experiences with each of the different 
feedback types. 


6.1 Introduction 


Recent developments in interactive technologies have seen major changes in the 
way artists and performers interact with digital music technology. Computer music 
performers are presented with a myriad of interactive technologies and afforded
near-complete freedom of expression when creating computer music or sound art. In real
time, they can manipulate multiple parameters relating to digitally generated sound,
effectively creating gesture interfaces and sound generators that have no real-world 
acoustic equivalent. When presented with such freedom of interaction, the challenge 
of providing performers with a tangible, transparent and expressive device for sound 
manipulation becomes apparent. 

DMIs present musicians with performance challenges that are often unique to 
computer music. One of the most significant deviations from traditional musical 
instruments is the level of physical feedback conveyed by the instrument to the user. 
Currently, new interfaces for musical expression are not designed to be as physically 
communicative as acoustic instruments. Specifically, DMIs are often void of haptic 
feedback and therefore lack the ability to impart important performance information 
to the user [1]. 

In the field of human-computer interaction (HCI), the formal evaluation of an
input device involves a rigorous and structured analysis, often involving the use of 
specific methods to ensure the repeatability of a trial. The formality of the process 
guarantees that the findings of one researcher can be applied and developed by 
other researchers. In computer music, the testing of DMIs has been highlighted as 
being unstructured or idiosyncratic [2-5] (see Sects. 5.3.2.2, 10.3.2, 11.4, 12.3 and 
12.4). However, it is arguably challenging to accurately measure and appraise the 
creative and effective application of technology in a creative context. These aspects 
of a DMI’s evaluation cannot effectively be represented by quantitative techniques 
alone. In response to these shortcomings, we seek to gather data via both quantitative 
and qualitative means, as has been seen in other studies [3]. Presented within this 
chapter is an experiment that evaluates and compares the major components of haptic 
feedback. To achieve this, the feedback mechanisms of two prototype DMIs were 
assessed, namely the Haptic Bowl and the Non-Haptic Bowl, which were augmented 
to provide vibrotactile feedback [6]. The objective of the experiment was to quantify 
the effect of haptic feedback in the performance of pitch selection tasks; specifically, 
the move time and accuracy that could be achieved with different feedback types. In 
addition to measuring device performance, the users’ perception of usability and
their overall experiences within the context of the experiment were also captured and
analysed.

To formally structure the experiment, a validated framework of analysis was 
applied [7]. This DMI evaluation framework was designed to tackle the multipara- 
metric nature of musical interactions while also assessing the practical design features 
applied in the construction of a DMI. By applying a structured evaluation model,
users’ attitudes towards functionality, usability and user experience were captured
while they undertook a pitch selection task. For this analysis, a pitch selection task
was chosen to quantitatively measure user performance and maintain objectivity 
in the investigative and evaluation methodologies that were later applied. Follow- 
ing this, structured post-task questionnaires were conducted after each stage of the 
experiment to elicit further information and to closely correlate quantitative with 
qualitative data. An empathy map for each feedback stage was then constructed to 
connect in-task results with post-task questioning.

In accordance with the evaluation framework, the structure of the chapter is presented
as follows: each device is described and the feedback affordances they apply
are reviewed; the experiment is then contextualised, stating the intentions and con- 
straints of the study; a functionality trial is then presented that measures the move 
time and pitch selection accuracy of the different feedback stages; the usability and 
user experience data of the study are then presented; finally, the findings of the 
analysis and post-task data are discussed and concluded. 


6.2 Experiment Design 


It has been observed that traditional evaluation methodologies from HCI are unsuit- 
able for the direct evaluation of DMIs without prior contextualisation and augmenta- 
tion [1]. This is mainly due to the complex coupling of action and response in musical 
interaction (see Sect. 2.3). These two factors operate within the tightly linked pro- 
cesses of a focused spatiotemporal task. Therefore, if this process is interrupted for an 
evaluation (e.g. for a questionnaire or thinking-aloud protocols), the participants are 
inevitably separated from their instantaneous thoughts and therefore from achieving 
their goals. Due to this, any system of analysis that is applied outside of the interac- 
tion is disconnected from the task being evaluated. Similar problems exist in other 
areas of study, for example in the evaluation of gaming controllers [8]. To counter 
this, adaptive and reflective models have been developed in HCI that concentrate 
on specific elements of an interaction, and these techniques have been augmented 
to evaluate the participants’ experience in specific contexts. In the study presented, 
several validated HCI evaluation techniques were applied to combat the potential for 
task evaluation disconnect. 


6.2.1 Functionality Testing 


To assess the functionality of the feedback elements from the Haptic and Non-Haptic 
Bowl devices, an experiment was devised which required participants to use the 
interfaces in a non-musical pitch selection task. This task was designed to generate 
quantitative data that could be used to accurately compare each feedback stage. From 
analysing the functional mechanisms of both devices, a Fitts’ Law style experiment 
was designed. 


6.2.2 Adapting Fitts’ Law 


Fitts’ Law is used in HCI to describe the relationship between movement time,
distance and target size when performing rapid aimed movements (Fig. 6.1). Per
this law, the time it takes to move and point to a target of a specified width (W)
and distance (D) is a logarithmic function of the spatial relative error [9]. While
the logarithmic relationship may not exist beyond Windows, Icons, Menus, Pointer
(WIMP) systems, the same experimental procedures can be followed to produce data
for analysis in an auditory context [10, 11].

Fig. 6.1 Fitts’ Law movement model (diagram labels: starting position, the target)

In the following experiment, we measured the time it took a participant to rapidly 
aim their movements towards a specified target pitch, which was constrained within 
a predefined frequency range. Essentially, physical distance was remapped to audio 
frequency range, where the start position corresponded to a point below 20 Hz and
the target position lay within a range below 1 kHz. The target’s width was
predetermined as a physiological constant of 3 Hz for sinewave signals below 500 Hz, 
increasing by approximately 0.6% (about 10 cents) as frequency increased towards 
1 kHz [12]. 
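
The remapping described above can be sketched numerically. The example below assumes Fitts’ original index of difficulty, log2(2D/W), with distance and width both expressed in Hz, and a start position of exactly 20 Hz; these choices, and all function names, are illustrative assumptions rather than the analysis actually performed in the study.

```python
import math

def target_width_hz(freq_hz):
    """Target width: 3 Hz below 500 Hz, about 0.6% of the frequency above it."""
    return max(3.0, 0.006 * freq_hz)

def index_of_difficulty(distance, width):
    """Fitts' original index of difficulty in bits: log2(2D / W)."""
    return math.log2(2.0 * distance / width)

def cents_to_ratio(cents):
    """Frequency ratio corresponding to an interval given in cents."""
    return 2.0 ** (cents / 1200.0)

start_hz = 20.0  # assumed value; the text only says "a point below 20 Hz"
for target_hz in (110.0, 440.0, 987.77):
    w = target_width_hz(target_hz)
    d = target_hz - start_hz
    print(f"{target_hz:7.2f} Hz  W = {w:.2f} Hz  ID = {index_of_difficulty(d, w):.2f} bits")
```

Under these assumptions the target width is continuous at 500 Hz (0.6% of 500 Hz is 3 Hz), and an interval of 10 cents corresponds to a frequency ratio of about 1.0058, which matches the 0.6% figure quoted in the text.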


6.2.3 Context of Evaluation 


The evaluation context of the experiment was augmented to fit that of the per- 
former/composer and designer’s perspective. These stakeholders concern themselves 
with how a device works, how it is interacted with, and how the overall design of a 
system responds to interaction [13]. Considering this, the experiment was purpose- 
fully designed to objectively evaluate the performance of device feedback and not the 
musical performance of the participant. To maintain objectivity, a feedback focused 
experiment was devised and executed to quantify the device performance in pitch 
selection tasks. Secondly, validated post-task questionnaires were issued to quantify 
the usability of the device. This was achieved by employing a Single Ease-of-use 
Question (SEQ), Subjective Mental Effort Question (SMEQ) and NASA Task Load 
Index (NASA-TLX) questionnaires. Finally, interviews focusing on user experience 
were conducted as well as a User Experience Questionnaire (UEQ) to evaluate how 
the participants experienced the interaction.

Although post-task user experience questioning is problematic due to user disconnect
issues, previously validated techniques were applied to accurately evaluate
each feedback stage. Firstly, a preference of use question was posed to the partici- 
pants to evaluate their opinion on the practical application of feedback in their own 
performances [14]. Secondly, the UEQ was completed to collect quantitative data 
about the participant’s impressions of their experience [15]. This was followed by a 
moderately structured post-task interview formulated around specific topics. These 
known areas of concern in musical interactions included learnability, explorability, 
feature controllability and timing controllability [16]. These data were then subjected 
to content analyses. The content analysis topics were designed to elicit and explore 
critical incidents [17] that have been highlighted as problematic in the field of new 
instruments for musical expression. 

Following the experiment, empathy mapping was applied in the context of user 
experience to understand and to form empathy for the end-user. This technique is 
typically applied to consider how a person is feeling and to better understand what they
are thinking. This task was achieved by recording what the participants were
thinking, feeling, doing, seeing and hearing as they were performing the task. With 
these data, it was possible to create a general post-experiment persona to raise issues 
specific to the context of the analysis. It is helpful to create empathy maps to reveal 
connections between a user’s movements, their choices and the judgements they 
made during the task in a way that the participants may not be able to articulate post- 
task. Therefore, empathy mapping data were recorded during the practical stages of 
the functionality study to capture instantaneous information about the participants’ 
experience without interrupting the task. Observations about what the participants 
said out loud, sentiments towards the device, their physical performance and how 
they used prior information of other devices during the experiment were recorded to 
validate and potentially expand upon the post-task questionnaire and interview data 
presented above. 


6.2.4 Device Description: The Bowls 


For the analysis of haptic feedback in DMI interactions, prototype devices were 
constructed (Fig. 6.2). Each device was designed to represent a variety of feedback 
techniques, and several different input metaphors were initially explored. From this 
assortment, two devices were selected that could display the unique characteristics 
of haptic feedback in combination and isolation, while affording the user freedom 
of movement in a three-dimensional (3D) space around the device. Specifically, the
Haptic Bowl and the Non-Haptic Bowl were chosen.

Fig. 6.2 Haptic Bowl (left) and Non-Haptic Bowl (centre), user for scale (right)


6.2.4.1 The Haptic Bowl 


The Haptic Bowl is an isotonic, zero-order, alternative controller that was developed 
from a console game interface [6]. The internal mechanisms of a GameTrak¹ tethered
spatial position controller were removed and relocated into a more robust and
aesthetically pleasing shell. The original Human Interface Device (HID) electronics
were removed and replaced with an Arduino Uno SMD edition.² This HID upgrade
reduced communication latencies and allowed for the development of further device 
functionality through the addition of auxiliary buttons and switches. The controller 
has very little in the way of performer movement restrictions as physical contact 
with the device is reduced to two tethers that connect the user via gloves. Control of 
the device requires the performer to visualise an area in three dimensions, with each 
hand tethered to the device within this space. 


6.2.4.2 The Non-Haptic Bowl 


This device is also an isotonic, zero-order controller, based upon PING)))³ ultrasonic
distance sensors and basic infrared (IR) motion capture (MOCAP) cameras, thus 
affording contactless interaction. The ultrasonic components are arranged as digital 
inputs via an Arduino Micro, and MOCAP cameras were created from modified 
Logitech C170 web cameras with visual light filters covering their optical sensors 
and internal IR filters removed. An IR LED embedded in a ring was then used to 
provide a tracking source for these MOCAP cameras. The constituent components 
are all contained within an aluminium shell, similar in size and shape to the Haptic
Bowl. The use of these sensors matched the input capabilities of the Haptic Bowl,
providing a comparable interaction. However, due to its contactless nature, this input
device has fewer movement restrictions than the Haptic Bowl. Control of the Non-Haptic
Bowl also requires the performer to visualise a 3D area, with input gestures
captured within a comparable space to that of the Haptic Bowl.

¹ https://en.wikipedia.org/wiki/Gametrak (last accessed on 7 November 2017).
² https://www.arduino.cc/en/Main/ArduinoBoardUnoSMD (last accessed on 7 November 2017).
³ https://www.parallax.com/product/28015 (last accessed on 7 November 2017).


6.2.5 Device Feedback Implementation 


In addition to the user’s aural, visual and proprioceptive awareness, haptic feedback 
components were incorporated into the devices to communicate performance data 
to the user. In the Haptic Bowl, additional feedback was included in the form of a 
strengthened constant-force spring mechanism for both tether points. The device’s
spring mechanisms were strengthened to further assist in hand localisation and the 
positioning effects this created in relation to the main body of the instrument. Fur- 
thermore, for vibrotactile feedback, the audio output from a sinewave-generating 
audio module was rerouted to voice-coil actuators (see Sect. 13.2) embedded in 
the device’s gloves. The sinewave audio signal was routed via a Bluetooth receiver 
embedded within the Haptic Bowl. This device was then connected to the voice-coil 
actuators contained within each of the device’s gloves [18], thereby providing
sinewave feedback in real time that is directly related to the audio output, as is 
innately delivered in acoustic musical instrument interactions. It was also possible 
to apply this vibrotactile feedback to the Non-Haptic Bowl via the same gloved 
actuators. To achieve this, the sinewave audio output was again routed through the 
same type of Bluetooth speaker, but in this case, the speaker was kept external from 
the device. The removal of the speaker from the DMI was done to highlight the 
disconnect of these feedback sources in existing DMI designs. 

From combinations formulated around these feedback techniques, it was possible 
to create four feedback profiles for investigation: 


• Haptic feedback (passive constant-force and active vibrotactile feedback)
• Force feedback (passive constant-force feedback only)
• Tactile feedback (active vibrotactile feedback only)
• No feedback (no physical feedback)


Each feedback stage operated within the predefined requirements for sensory 
feedback as outlined in earlier research [19]. 


6.2.6 Participants 


Twelve musicians participated in the experiment. All participants were recruited from 
University College Cork and the surrounding community area. The participants were 
aged 22-36 (M = 27.25, SD = 4.64). The group consisted of 10 males and 2 females.
All participants self-identified as being musicians, having been formally trained or 
performing regularly in the past 5 years. 


6.2.7 Procedure 


All stages of the experiment were conducted in an acoustically treated studio space. 
The USB output from each Bowl device was connected to a 2012 MacBook Pro 
Retina. The serial input data from the devices were converted into Open Sound 
Control (OSC) messages in Processing⁴ and outputted as UDP⁵ information. Pure
Data (Pd) then received and processed these data. Within Pd, the coordinates over
the z-plane were used to create a virtual Theremin,⁶ with the right hand controlling
the pitch, and the left hand the volume. The normal operational range of both devices 
was altered to fit within an effective working range of 30 cm; this range lay slightly 
above an average waist height of 80 cm (the average height in Ireland, as of 2007, is 
170 cm and the waist-to-height ratio was calculated as 0.48). A footswitch was employed
by the participant to indicate the start and end of each test. 
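
To make the data path concrete, the sketch below packs a set of coordinates into an OSC message and sends it as a UDP datagram. It is written in Python rather than Processing, and the address pattern `/bowl/right`, the host, the port and the three-float payload are all hypothetical; the chapter does not specify the message layout that was actually used.

```python
import socket
import struct

def osc_string(text):
    """ASCII bytes, null-terminated and padded to a multiple of four bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)

def osc_message(address, *values):
    """Build an OSC message whose arguments are all 32-bit big-endian floats."""
    type_tags = "," + "f" * len(values)
    payload = b"".join(struct.pack(">f", v) for v in values)
    return osc_string(address) + osc_string(type_tags) + payload

def send_osc(packet, host="127.0.0.1", port=9000):
    """Send one packet as a UDP datagram (host and port are placeholders)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))

# Hypothetical x, y, z coordinates for the right-hand tether
packet = osc_message("/bowl/right", 0.5, 0.25, 1.0)
```

Calling `send_osc(packet)` would deliver the datagram to whatever process is listening on the chosen port; UDP offers no delivery guarantee, which is usually acceptable for a continuous stream of controller data.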

After a brief demonstration, participants were given 5 min of free play to familiarise
themselves with the operation of the device. Following this, subjects were given
a further 5 min to practise the experimental procedure. The total time-on-task
varied between participants and experiment stages, but remained within an
average range of 1.5–2 h. Participants were presented with each feedback type
in counterbalanced order (a method for controlling order effects in repeated-measures
designs). For ecological validity, participants were required to wear the device gloves
throughout all experimental stages. The task consisted of listening to a specific pitch,
and then seeking and selecting that target pitch with the device as quickly and as
accurately as possible. The listening time required for remembering the target pitch
varied between participants from 5 to 10 s at most. The start position for
all stages was with hands resting in a neutral position at the waist. In each trial,
participants used the footswitch to start and finish recording movement data. For
each run of the experiment, eleven frequencies were selected in counterbalanced
order across a range of 110–987.77 Hz. All frequencies in the experiment had a
relative pitch value. Participants performed three runs, with a brief rest between
each. The processing patch was used to capture input movement data and the time
taken to perform the task; these data were then outputted as a .csv file for analysis.
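
One common way to counterbalance four conditions is a balanced Latin square (Williams design), in which each condition occupies each serial position once and follows every other condition exactly once. The sketch below assumes this scheme and cycles it over twelve participants; the text does not state which counterbalancing scheme was actually used, and the short condition labels are shorthand for the four feedback profiles.

```python
def balanced_latin_square(n):
    """Williams design for even n, returned as rows of condition indices."""
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Each further row shifts every index in the first row by one (mod n).
    return [[(c + shift) % n for c in first] for shift in range(n)]

conditions = ["Haptic", "Force", "Tactile", "None"]
square = balanced_latin_square(len(conditions))

# Twelve participants: cycle through the four rows three times.
orders = [[conditions[i] for i in square[p % len(square)]] for p in range(12)]
```

With twelve participants and four rows, each presentation order is used by exactly three participants.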


4 A programming environment for the visual arts: https://processing.org/ (last accessed on 26 
November 2017). 

5 User Datagram Protocol, a protocol for network communication. 

6 An early electronic musical instrument named after its Russian inventor Lev Theremin, in which 
the pitch and volume are controlled by the position of the performer’s hands relative to a pair of 
antennas. 
After each feedback stage of the experiment, participants were asked to complete 
a post-task evaluation questionnaire and informal interview. All interviews followed 
the same guiding question: 


• What were the central elements of device feedback that resulted in task success or 
failure? 


This directorial question was then operationalised by the following: 


• What positive attributes did the feedback display? 
• What negative attributes did the feedback display? 
• What features made the task a success or failure? 
• Describe this success or failure in a musical context. 


Throughout the interview, interview-laddering⁷ was applied to explore the 
subconscious motives that led to the specific criteria being raised. A Critical Incident 
Technique (CIT) analysis was then applied to extrapolate upon the interview data 
collected. This set of procedures was used to systematically identify any behaviours 
that contributed to success (positive) or failure (negative) in the specific context. 


6.3 Results 


Functionality data were collected during the experiment to provide objective, 
quantitative measures of the effects of feedback in audio-based exercises. 
Following this, the validated questionnaires and qualitative 
interview techniques were used to gather subjective opinions from participants. 
Participants were not made aware of their performance data when being interviewed. 


6.3.1 Functionality Results 


The results from the functionality evaluation can be seen in Fig. 6.3 and Table 6.1. An 
analysis of variance yielded no significant variations in move time for the different 
feedback types, with p > 0.05 for all frequencies. For the individual feedback stages, 
participants could target and select pitches within the predetermined target size of 
3 Hz for all frequencies below and including 261.6 Hz. As expected, the accuracy of 
pitch selection decreased with frequency increment. Above 261.6 Hz and up to and 
including 523.25 Hz, the deviation from target pitch increased, but remained within 
the expected range. Beyond this, from 523.25 Hz up to and including 975.83 Hz, the 
average deviation increased further. Notably, the no feedback stage of the experiment 
exceeded the expected deviation constant of 6 Hz for this range by 3 Hz. Like move 


7 An interviewing technique where simple responses are probed and explored by the interviewer to 
discover the subconscious motives of the participant. 
Fig. 6.3 Mean move time over frequency for all feedback stages 


Table 6.1 Average deviation from target for all feedback stages 

              Frequency range (Hz) 
Feedback      ≤ 261.6   SD     261.6-523.25   SD     > 523.25   SD 
Haptic        0.41      0.24   0.9            0.65   4.21       2.21 
Force         0.33      0.25   0.78           0.4    5.36       4.73 
Tactile       1.03      0.62   1.7            0.98   5.1        4.18 
No feedback   1.07      0.87   1.15           0.48   9.6        3.43 


time measurements, although there were practical variations in the accuracy of target 
selection across all feedback stages, there was found to be no significant effect of 
feedback on the accuracy of frequency selection, with p > 0.05 for all feedback types. 


6.3.2 Usability Results 


For the SEQ, the participants were given the opportunity to consider their own performance 
and factor this into their response. Users had to fit their rating to the range of 
answers available (7 in total), according to their interpretation of the difficulty 
of the task. The post-task SEQ answers can be 
seen in Fig. 6.4 and Table 6.2. 

For the haptic feedback stage, the largest portion of users (42%) found that the task 
was somewhat difficult for them to complete. Perceived difficulty then increased 
with each subsequent feedback stage, reaching a rating of very difficult (58%) 
for the no feedback stage. When verbally 
questioned, participants expressed that while they were fully engaged in the task, the 
perceived difficulty of performance using the devices was as it would be if they were 
performing for the first time with any new instrument. This increase in cognitive 
Fig. 6.4 Diverging stacked bar chart for the SEQ (rows: Haptic, Force, Tactile, Non-Haptic; response categories: Very Difficult, Mostly Difficult, Somewhat Difficult, Neither Difficult nor Easy, Somewhat Easy, Mostly Easy, Very Easy) 


Table 6.2 SEQ evaluation for all feedback stages 

Feedback      Evaluation meaning                         Median   IQRᵃ 
Haptic        Neither difficult nor easy/somewhat easy   4.5      3 
Force         Neither difficult nor easy/somewhat easy   4.5      3 
Tactile       Somewhat difficult                         3        0.5 
No feedback   Mostly difficult                           2        1 

ᵃ Interquartile range 


load moved them to consider their performance more critically. Participants were 
unaware of their actual move time and accuracy scores at this point. 
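
The median and interquartile range reported for ordinal ratings such as the SEQ can be computed as sketched below. The ratings are hypothetical (not the study's data), and the "inclusive" quartile method is an assumption; the chapter does not say which quartile convention was used, and different conventions can give slightly different IQR values.

```python
import statistics


def median_and_iqr(ratings):
    """Return (median, interquartile range) for a list of ordinal ratings."""
    # quantiles() sorts its input; n=4 yields the three quartile cut points.
    q1, _, q3 = statistics.quantiles(ratings, n=4, method="inclusive")
    return statistics.median(ratings), q3 - q1


# Hypothetical 7-point SEQ ratings from 12 participants (illustration only).
example_ratings = [3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6, 7]
median, iqr = median_and_iqr(example_ratings)
print(f"median = {median}, IQR = {iqr}")
```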

A Friedman test revealed a statistically significant effect of feedback upon SEQ 
answers across the four different feedback stages: χ²(3, n = 12) = 31.75, p < 0.001. 
Following this, post hoc Wilcoxon signed-rank tests were conducted to 
explore the impact of device feedback on SEQ answers. There was found to be 
a statistically significant effect of feedback on device scores, with effect sizes 
ranging from 0.34 to 0.45. Post hoc comparisons indicated that the score for the 
no feedback stage of the experiment was significantly different from the haptic and 
force stages after Bonferroni adjustment. There were no significant 
differences between the haptic and force feedback stages, or between the tactile and no feedback stages. 
This indicated that the participants’ perception of task difficulty was significantly 
different from no feedback when force feedback was presented in the interaction. 
Furthermore, tactile feedback played no role in this perception rating. 
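
For readers unfamiliar with the Friedman test, the statistic can be sketched in plain Python as below. Each participant's ratings are ranked across the feedback stages, and the rank sums per stage are compared. This is a minimal sketch using the usual tie-corrected formula, not the authors' analysis script; the example ratings are hypothetical, and the p-value (from a chi-square distribution with k − 1 degrees of freedom) is not computed here.

```python
def average_ranks(values):
    """Rank values from 1..k, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_chi_square(rows):
    """Tie-corrected Friedman statistic; rows holds one list of k ratings per participant."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    tie_term = 0.0
    for row in rows:
        row = list(row)
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
        for value in set(row):
            t = row.count(value)
            tie_term += t ** 3 - t
    stat = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    correction = 1.0 - tie_term / (n * k * (k * k - 1))  # zero if every row is fully tied
    return stat / correction


# Hypothetical ratings: one row per participant, columns = haptic, force, tactile, none.
example = [[5, 5, 3, 2], [4, 5, 3, 1], [6, 4, 3, 2], [5, 4, 2, 2]]
print(friedman_chi_square(example))
```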

In comparison to the SEQ, the SMEQ presented a near-continuous response choice 
for the participants to choose from (Fig. 6.5). Theoretically, this allowed the partic- 
ipants to be more precise regarding their estimation of the device’s usability. The 
premise of this scale was to elicit an indication of the user’s thoughts towards 
the amount of mental effort they exerted during the task. The mean value of the 
Fig. 6.5 Boxplots representing mean SMEQ answers for each unique feedback element 


Table 6.3 SMEQ evaluation for all feedback stages 

Feedback      Evaluation meaning              Mean    SD 
Haptic        Some amount of effort           45      22.16 
Force         A reasonable amount of effort   45.42   16.98 
Tactile       Fair amount of effort           62.17   13.59 
No feedback   Fair amount of effort           71.25   12.08 


SMEQ answers for each feedback type can be seen in Table 6.3. The results sup- 
port the usability analysis of the SEQ; however, this scale measured the amount of 
effort the participants felt they invested rather than the amount of effort demanded 
from them. 

A repeated-measures ANOVA was conducted to compare scores on the SMEQ 
scale. There was found to be a significant effect for feedback: F(3, 9) = 11, p = 0.002, 
with partial η² = 0.79. The post hoc comparisons indicated that the score for the no 
feedback stage of the experiment was significantly different from the haptic, force 
and tactile stages. There was found to be no significant difference between haptic 
and force feedback stages. 
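
The reported partial η² can be cross-checked from the F value and its degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The helper name below is hypothetical; the sketch only verifies the arithmetic and does not reproduce the analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# SMEQ result reported above: F(3, 9) = 11
print(round(partial_eta_squared(11, 3, 9), 2))  # 0.79, matching the reported value
```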

Following the evaluation of perceived effort, the participant’s subjective workload 
was recorded with a paper and pencil NASA-TLX assessment questionnaire. In this, 
the total workload is divided into six TLX subscales, the results of which can be seen 
in Fig. 6.6. The first indicator in the NASA-TLX subscale required the user to signify 
Fig. 6.6 NASA-TLX subscale ratings of usability for each unique feedback element 


how demanding they found the task in terms of its complexity. The observed results 
denote that a somewhat small amount of mental and perceptual activity was required, 
indicating that the task was simple to complete for all feedback stages. Next, the mean 
physical demand of the task was measured, showing that the participants found the 
task relatively easy to complete, and that a reasonable amount of physical activity was 
demanded from them in completion of the task. In terms of temporal demand—the 
time pressure felt in performing the task—the mean user rating of the experiment 
shows that the pace of the task was realistic and that participants were not rushed, 
had plenty of time to complete the task without pressure, and that the task elements 
were presented within a realistic time frame. In the self-evaluation of performance 
in the TLX questionnaire, participants indicated that they were relatively unsatisfied 
with their own performance. 

The users’ low satisfaction with the success of their performance corroborates 
the earlier findings of negative self-satisfaction in performance of the task. It also 
highlights some difficulties in the completion of the task and that a raised mental 
awareness was required during its execution. Notably, all feedback stages were rated 
equally negatively, with no significant effect of feedback. Therefore, although a neg- 
ative evaluation of performance was recorded, there was no distinction between the 
performance of the different feedback stages as was present in the SEQ and SMEQ. 
In contrast to the self-evaluation of performance, participants indicated that they 
worked only somewhat hard mentally and physically to accomplish their level of 
performance. This indicated that the participants did not feel that they had worked 
particularly hard to reach their overall level of performance, even though an unsatisfactory 
evaluation of performance was measured. 

Next, participants recorded that they were not irritated or stressed by the task. 
The TLX measured relatively low frustration levels, weighting towards a relaxed 
attitude during the experiment. These results indicated that although participants 
were relatively unsatisfied with their performance, they were not stressed or unhappy. 
Finally, a mean overall “raw TLX” measure of workload was calculated to represent 
the overall TLX rating of each feedback type. Due to time restrictions, a pairwise 
comparison of each dimension was not deemed necessary and thus not undertaken. 
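
The "raw TLX" score is conventionally the unweighted mean of the six subscale ratings, which is why no pairwise weighting step is needed. A minimal sketch follows; the subscale keys and the example ratings are hypothetical, and a 0-100 rating scale is assumed.

```python
TLX_SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Unweighted mean of the six NASA-TLX subscale ratings (raw TLX)."""
    missing = [name for name in TLX_SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscale ratings: {missing}")
    return sum(ratings[name] for name in TLX_SUBSCALES) / len(TLX_SUBSCALES)


# Hypothetical ratings for one participant and one feedback stage (illustration only).
example = {
    "mental_demand": 40,
    "physical_demand": 35,
    "temporal_demand": 20,
    "performance": 60,
    "effort": 45,
    "frustration": 25,
}
print(raw_tlx(example))
```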

A repeated-measures ANOVA was conducted to compare scores on the different 
feedback stages. Although there were some noticeable variations in the mean 
scores for each category and feedback type, no significant effect of feedback was 
recorded at the p < 0.05 level for any category except effort (F(3, 9) = 4.22, 
p = 0.04, partial η² = 0.58). Post hoc testing for effort revealed a 
significant difference in mean scores for perceived effort between the no feedback 
and tactile feedback stages of the experiment (mean difference = 8.42, p = 0.046). This 
indicated that participants regarded the different feedback types as equally usable 
across all TLX categories except for effort, where there was a small difference in 
scores between the tactile and no feedback stages. 


6.3.3 User Experience Results 


The final stage of the functionality analysis incorporated a post-task assessment of 
the users’ experiences during the experiment. A pre-existing questionnaire was used 
to measure user experience quickly, simply and as immediately as possible. Six critical 
aspects of experience were captured via the UEQ questionnaire: attractiveness, 
perspicuity, efficiency, dependability, stimulation and novelty (Fig. 6.7). The overall 
internal consistency of the user experience scales was acceptable, with α = 0.88. 
However, poor internal consistencies for some of the individual feedback stages were 
observed, highlighting some disparity between participant answers. The scale 
ranged from −3 (very bad) to +3 (very good). However, maximum ratings 
have previously been reported as unlikely in user studies [15]; therefore, a 
more restrictive range was applied to compensate for different answer tendencies of 
the participants. For user experience measures on this scale, mean values between 
−0.8 and 0.8 are representative of a neutral evaluation of the corresponding dimension. 
Values greater than 0.8 represent a positive evaluation, and values below −0.8 
represent a negative evaluation. 
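
The interpretation rule just described can be written as a small helper. The function name and the example mean values are hypothetical; only the ±0.8 thresholds and the −3 to +3 range come from the text.

```python
def classify_ueq_mean(mean_score, threshold=0.8):
    """Map a UEQ scale mean (range -3..+3) to a negative, neutral or positive evaluation."""
    if not -3.0 <= mean_score <= 3.0:
        raise ValueError("UEQ scale means lie between -3 and +3")
    if mean_score > threshold:
        return "positive"
    if mean_score < -threshold:
        return "negative"
    return "neutral"


# Hypothetical scale means for one feedback stage (illustration only).
for scale, mean in {"efficiency": 1.4, "novelty": 0.3, "dependability": -1.1}.items():
    print(scale, classify_ueq_mean(mean))
```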

A repeated-measures ANOVA was conducted to compare UEQ scores revealing 
that there were statistically significant variations in user experience answers for the 
efficiency, dependability and novelty category ratings at the p < 0.05 level. How- 
ever, pairwise comparisons of novelty with adjustments for multiple comparisons 
(Bonferroni) revealed no significant differences between the feedback stages. The 
categories of efficiency and dependability specifically relate to the user’s experience 
Fig. 6.8 Boxplots representing UEQ efficiency and dependability for each unique feedback stage 


of the ergonomic quality aspects that were applied in the design of the Bowl devices 
(Fig. 6.8). Participants evaluated their experience of device efficiency in the chosen 
task as being quick and organised for haptic feedback, reducing towards a more neutral 
rating as feedback was reduced in the order of force, tactile and no feedback, 
respectively. Similarly, the participants’ experience of dependability of the feedback 
stages showed the same downwards trend, with experience ratings of predictable and 
secure behaviour for haptic and force feedback being high and a much more neutral 
rating for tactile and no feedback. 

From these findings, participants rated the different feedback stages relatively 
equally for the categories of attractiveness, perspicuity, stimulation and novelty. 
Post hoc comparisons with Bonferroni adjustment indicated that the mean score for 
efficiency for force feedback was significantly different from the no feedback stage. In 
addition, the same test revealed statistically significant differences in dependability 
ratings between the haptic and force feedback stages on the one hand and the tactile and no feedback stages on the other. 
Table 6.4 Participant preference of use 

Feedback      Evaluation meaning                            Median   IQR 
Haptic        Somewhat often                                5        1.5 
Force         Neither often nor occasionally                4        2 
Tactile       Occasionally/neither often nor occasionally   3.5      1.25 
No feedback   Occasionally                                  3        2 


This significance highlighted a perceived efficiency rating difference between the 
feedback stages of force, tactile and no feedback. These perceived differences are 
interesting due to the lack of difference observed in performance. 


6.3.4 Interview Data 


Participants were asked whether they would like to use each feedback stage to per- 
form with outside of the experiment. Participants’ answers varied across the different 
feedback stages (Table 6.4). Most participants were pleased with their evaluation of 
feedback performance for each device and thought that they would use the device 
outside of the experiment. However, some users also indicated that they did not 
have an opinion about usage preference, as they would not normally use a com- 
puter interface to make music. When questioned further, users indicated that they 
were not particularly inspired by the experiment methodology, but suggested that if 
they could expand or explore the devices’ parameters further they might have rated 
it more favourably. The estimated usage ratings for the different device feedback 
stages noticeably reduced from the haptic stage through to the no feedback stage 
(Fig. 6.9). Participants who were not accustomed to performing with computer inter- 
faces expressed that they felt increasingly negative towards devices as feedback was 
reduced. 

A Friedman test revealed a statistically significant difference in device use 
answers across the four different feedback stages, χ²(3, n = 12) = 25.05, p < 0.001. 
Following this, post hoc Wilcoxon signed-rank tests were conducted to explore 
the impact of device feedback on estimated use answers. The score for the haptic stage 
differed significantly from all other feedback stages at the p < 0.0125 level, 
with a medium-to-large effect size ranging from 0.24 to 0.44. 
There were also significant differences in results between the no feedback stage and 
the force and tactile feedback stages. This demonstrates how haptic feedback can be 
used as a preferential feature when choosing between multiple DMIs in composition 
or music performance. 
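
The 0.0125 threshold is consistent with a Bonferroni adjustment of α = 0.05 over four comparisons, and effect sizes for Wilcoxon tests are often reported as r = |Z| / √N. Both points are assumptions: the chapter does not state how many comparisons were corrected for or how the effect sizes were derived. The sketch below, with hypothetical function names and a hypothetical Z value, shows the arithmetic only.

```python
import math


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni adjustment."""
    return alpha / n_comparisons


def wilcoxon_effect_size_r(z, n_observations):
    """Effect size r = |Z| / sqrt(N), with N the number of observations over both conditions."""
    return abs(z) / math.sqrt(n_observations)


print(bonferroni_alpha(0.05, 4))

# Hypothetical Z statistic; 12 participants measured twice gives N = 24 observations.
print(round(wilcoxon_effect_size_r(-2.1, 24), 2))
```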
Fig. 6.9 Diverging stacked bar chart for preference of use evaluation (rows: Haptic, Force, Tactile, Non-Haptic; response categories: Never, Rarely, Occasionally, Neither Often nor Occasionally, Somewhat Often, Most Often, Very Often) 


Participants were asked open-ended questions to gauge their opinions about the 
different feedback stages. These questions were then expanded upon in an interview, 
with care taken not to bias the participants’ responses. A CIT analysis was conducted 
based upon the participant’s answers to record the users’ attitudes to the different 
feedback types. Content analysis techniques were then applied to categorise the 
responses into areas of concern; these included: personal preference, playability, 
comparison to other musical instruments, learnability, comparison to other DMIs, 
explorability and tempo. 

From the interview transcripts, coherent thoughts and single statements were iden- 
tified and extracted. After redundancy checking, a total of 322 single statements were 
counted (M = 80.5, SD = 15.77, per feedback stage). Following this, three researchers 
were independently employed to iteratively classify this pool of statements as either 
“positive” or “negative” performance evaluations. Although this process was initially 
reductive, a second analysis of the data was used to develop a bottom-up categorical 
system of classifications to known areas of concern in musical interactions: learn- 
ability, explorability, feature controllability and timing controllability [16]. 
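
Once each statement has been coded with a category and a valence, producing a content analysis table is a simple tally. The sketch below is illustrative only: the data structure and the example statements are hypothetical, and it does not model how the three researchers reconciled their independent classifications.

```python
from collections import Counter


def tally_statements(coded_statements):
    """Build (category, positive, negative, total) rows, largest total first."""
    counts = Counter(coded_statements)  # keys are (category, valence) pairs
    categories = {category for category, _ in counts}
    rows = []
    for category in categories:
        positive = counts[(category, "positive")]
        negative = counts[(category, "negative")]
        rows.append((category, positive, negative, positive + negative))
    # Sort by descending total, then alphabetically to keep ties deterministic.
    return sorted(rows, key=lambda row: (-row[3], row[0]))


# Hypothetical coded statements (illustration only).
coded = [
    ("playability", "positive"),
    ("playability", "positive"),
    ("playability", "negative"),
    ("tempo", "negative"),
    ("tempo", "negative"),
    ("learnability", "positive"),
]
for row in tally_statements(coded):
    print(row)
```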

Participants were inclined to be positive about the haptic feedback stage of the 
experiment and were pleased with the amount of feedback that was delivered (see 
Table 6.5). It was noted that participants were more vocal about their experiences at 
this stage than for the tactile and no feedback stages. The CIT highlighted personal 
preference as the most reported aspect of user experience at this stage. These comments 
highlighted the overall enjoyment of participants when interacting with the 
device. However, while many comments were positive, participants highlighted some 
negative ergonomic aspects of the interaction as well. Comments about playability 
mainly focussed on interaction difficulties during the task. However, many remarks 
made in the playability category were positive. These demonstrated an appreciation 
for the increased performance information provided by haptic feedback. Participants 
expressed a partiality for the familiar feel of the interface, which they felt increased 
their attention to their actions. This showed that if care was taken to provide haptic 
feedback in DMI designs, the end-user may gain an increased sense of awareness 




Table 6.5 Content analysis for haptic feedback 

                                          Comments 
CIT categories                            Positive   Negative   Total 
Personal preference                       17         2          19 
Playability                               11         4          15 
Comparison to other musical instruments   9          4          13 
Learnability                              11         2          13 
Comparison to other DMIs                  9          3          12 
Explorability                             6          4          10 
Tempo                                     5          5          10 
Total                                     68         24         92 


of their interaction, without involving overly complicated mechanisms or device 
processing power. The comparison to other musical instruments category produced 
several interesting responses in comparison to the other feedback stages. Specifically, 
comments that compared the device directly with acoustic instruments provided an 
interesting insight into the combination of force and tactile feedback. Learnability 
was seen more positively here than for the force and tactile feedback alone. These 
findings have been observed in other research areas, most notably in [20]. The category 
containing the most negative remarks was tempo. The negative comments expressed 
here all indicated that a tempo-based task would be very problematic to perform, and 
even the positive comments indicated that it would be challenging to accomplish. 

Table 6.6 shows the results of the content analysis of the force feedback stage of 
the experiment. This stage of the experiment received the same number of positive 
comments as the haptic stage; however, it also received more negative comments. As 
with the haptic feedback stage, force feedback received noticeably more comments 
than the tactile and no feedback stages of the experiment. Again, the category that 
contained the most comments was the personal preference category; however, the 
categories following this varied from the haptic feedback stage. 

The personal preference category of the force feedback stage contained comments 
discussing the novelty of the design and how the users found it interesting to use. 
There were also several positive comments focussing on simplicity and accessibility 
of the interface. However, some comments fixated negatively on the way pitch selection 
was achieved and the quality of sound reproduction from the small embedded 
speaker. Participants were more inclined to refer to other instruments in the comparison 
to other musical instruments category compared to the haptic feedback stage; 
however, some comments were critical of the lack of input gestures available to use. 
This further highlighted the restrictive nature of functionality-focused experimentation. 
Comments in the playability category discussed the implication of physical 
requirements for playing the device, either praising its accessibility or commenting 
on the interface requirements for interaction. The group containing the most negative 
Table 6.6 Content analysis for force feedback 

                                          Comments 
CIT categories                            Positive   Negative   Total 
Personal preference                       15         5          20 
Comparison to other musical instruments   11         7          18 
Playability                               9          7          16 
Comparison to other DMIs                  14         1          15 
Learnability                              11         0          11 
Explorability                             6          4          10 
Tempo                                     2          8          10 
Total                                     68         32         100 


Table 6.7 Content analysis for tactile feedback 

                                          Comments 
CIT categories                            Positive   Negative   Total 
Personal preference                       9          4          13 
Comparison to other musical instruments   5          4          9 
Playability                               1 
Comparison to other DMIs                  5 
Learnability                              7          1 
Explorability 
Tempo                                     0          6 
Total                                     37         23         60 


remarks was again the tempo category. Comments made here referred to issues of 
envelope attack time, jumps in pitch and concerns about accuracy. 

Table 6.7 shows the results of the content analysis of the tactile feedback stage. 
Participants were more conservative with comments, suggesting that there were not 
as many aspects of this feedback stage that were worthy of note. However, this may 
be attributable to the conservative nature of the participant pool. The categories that 
contained the most responses were personal preference, comparison to other musical 
instruments and playability. 

The personal preference category contained the largest amount of participant com- 
ments. This category also contained the most positive comments. These comments 
mainly reflected how the participants felt about the interaction and their curiosity 
about tactile feedback. However, some participants viewed the interaction as unpredictable 
and inaccurate. Comments in the comparison to other musical instruments 
category described how the interactions compared with the participants’ 
Table 6.8 Content analysis for no feedback 

                                          Comments 
CIT categories                            Positive   Negative   Total 
Personal preference                       5          7          12 
Comparison to other DMIs                  4          7          11 
Playability                               3          8          11 
Comparison to other musical instruments   4          6          10 
Learnability                              7          1 
Explorability                             5 
Tempo                                     1 
Total                                     29         37         66 


own instruments and compared accuracy between the two types of instrument. The 
playability category contained the highest number of negative comments. The par- 
ticipants were particularly focused on their own perception of lack of accuracy and 
precision in their movements. 

Finally, the results from the no feedback stage of the experiment can be seen in 
Table 6.8. This feedback stage yielded a high number of comments about personal 
preference, comparison to other DMIs and playability issues. The negative personal 
preference comments highlighted the participants’ frustrations at the lack of feedback 
provided. Positive comments were directed to the novelty and fun factor of the 
interaction. Participants were more inclined to compare the no feedback stage of 
the experiment with other DMIs, as seen in the comparison to other DMIs category. 
Many of the comparisons were negative, focussing again on the perceived inaccuracy 
of their movements. Positive comments highlighted the differences to other DMI 
interaction types. As with the tactile feedback stage of the experiment, the playability 
category contained the most negative comments. These comments mainly focused 
on the perceived accuracy of the interaction, with a few comments about creative 
application. 


6.3.5 Empathy Mapping 


Empathy mapping results are represented in Figs. 6.10, 6.11, 6.12 and 6.13 showing 
little deviation from observed actions during the functional task and verbal explana- 
tions of answers in the interview; this serves to further validate the analysis techniques 
applied. 




Fig. 6.10 Empathy mapping for Haptic feedback 

Comments recorded in the map: 
• “It’s a lot more involving than the other devices.” 
• “Feeling the vibrations makes it feel more like an instrument.” 
• “It was sometimes difficult to control the pitch.” 
• “Having force feedback and haptic feedback are good.” 
• “The tuning of just intervals on a violin has the same touch feeling as this.” 
• “Closer to a classical instrument experience.” 
• “This device is pretty cool, to be honest.” 

Pain: It would be difficult to play fast tempos. Difficult to alter the attack. It would take practice to play a melody. 
Gain: Increased interest and novelty. More information and help from the device. Amusing and fun. 


Fig. 6.11 Empathy mapping for force feedback 

Comments recorded in the map: 
• “Precise and stable as most other MIDI instruments I've used, but with a more interesting physical element.” 
• “This device is completely unlike any I've used before.” 
• “Sliding pitches were almost the same as on musical instruments.” 
• “The violin I play is more intuitive in some ways, but physically draining.” 
• “Novel and unique in comparison to devices that are more direct instrument metaphors.” 
• “Not really hard to learn.” 

Pain: Rhythm was difficult. Sound quality was lacking. The attack was hard to alternate. 
Gain: More interesting. Performer orientated. Reassurance of input actions. 




Fig. 6.12 Empathy mapping for tactile feedback 

Comments recorded in the map: 
• “The gloves give it a tactile sensation not unlike the buzzing violin on your shoulder.” 
• “The shape of the hand altered the ultrasonic rebound.” 
• “This is easier than other wireless devices because of the tactile feedback.” 
• “It's seldom to have ‘motion detection’ devices to interact with music.” 
• “Quick macro movements, slow micro adjustments.” 
• “I really like the vibrating glove feedback.” 
• “I think I took more time to find the correct pitches because it was sensitive to movement.” 
• “Unpredictable movements.” 
• “Increased concentration.” 

Pain: Perceived increase in time taken. Perceived decrease in accuracy. Unpredictable hand movements. 
Gain: Fun to use. Novel. Interesting to use. 


Fig. 6.13 Empathy mapping for no feedback 

Comments recorded in the map: 
• “Initially it felt difficult to coordinate the footswitch and the hand sensor.” 
• “It was hard to sound artistic on this interface.” 
• “The pitch tracking was similar to other devices I've used.” 
• “Most other interfaces involve sitting down.” 
• “Constant sound and glissando make it harder to play.” 
• “Difficult to figure out the range.” 
• “It's similar to other devices, just not as accurate.” 
• “No vibration, no touching, no comparison.” 
• “It’s harder to make it do what you want it to do.” 
• “Tempo would be too hard.” 
• “Quality of sound could be improved.” 

Pain: Difficult to control. Maxed out concentration. Frustration. 
Gain: Easy to draw comparisons between movement and sound. Freedom of movement. 
6.4 Discussion 


In the functional analysis, participants could select the specific pitches with observ- 
able increases in mean move time across the four stages of feedback. However, the 
statistical analysis of mean move time variance between the feedback stages showed 
no significant effect for feedback. This indicated that, although there 
was evidence of some practical differences between feedback types, haptic feedback 
and its derivatives had no consistent effect upon move times in pitch selection tasks. 
This finding supports the argument that haptic feedback has no significant effect 
upon a device’s performance in functional device evaluation exercises. Furthermore, 
the accuracy of pitch selection across the different feedback stages also varied with 
frequency. Mean deviation from the target frequency varied over three distinct bandwidths. 
For waveforms below 500 Hz, the predetermined physiological constant was 
maintained, with frequencies above this threshold increasing in deviation by approximately 
0.6%. The mean accuracy figures for each feedback stage showed no 
significant differences; however, there was again evidence of practical differences. 
These findings further support an argument that haptic feedback may have no sig- 
nificant quantitative effect upon a device’s performance in auditory pitch selection 
exercises. 

For the SEQ, it was found that when participants were given the opportunity to 
evaluate their own performance, they rated themselves differently for each feedback 
type. Participants evaluated the difficulty of the task with tactile and no feedback as 
being more challenging than with haptic and force feedback. There was no signif- 
icant difference between the haptic and force feedback stages or the tactile and no 
feedback stages, indicating that tactile feedback had no effect upon the participant’s 
perception of ease-of-use. However, from these observations, force feedback can be 
seen as having some positive effect. Although the quantitative measures of perfor- 
mance indicated that there was no significant difference in move time and accuracy, 
participants were inclined to be more self-critical of their performance than necessary 
when feedback was altered or removed. Many participants indicated that, although 
they found the task difficult across all stages, their level of engagement varied, as it 
would if they were performing for the first time with any new acoustic instrument. 

The SMEQ further supported these findings, with ratings showing that some 
amount of effort to a fair amount of effort was required to perform the exercises. 
However, the SMEQ presented a different focus than that of the SEQ, as it measured 
the perceived amount of mental effort applied during the task. The results showed that 
the amount of mental effort required increased as feedback was removed, although the 
actual quantified performance of the different feedback stages did not significantly 
differ. These differences were significant between the haptic and force feedback 
stages and the no feedback stage. Tactile feedback did not differ significantly from 
any other stage. Furthermore, the perception of increased mental effort was also
indicated as being a significant factor during the user experience analysis. From
analysing the functional data and comparing them to the participant’s perception
of mental effort and ease-of-use, it was observed that force feedback was the most
influential feedback type, with no significant effect observed for tactile feedback.
However, with the addition of tactile feedback to force feedback, there were also no 
detrimental effects on the user’s performance ratings. 

The overall raw usability testing revealed no significant effect of feedback across 
all feedback stages; however, the data collected did reveal some interesting results. 
For example, the self-measure of performance on the NASA-TLX scale was found 
to be reasonably poor for all feedback types. This indicated that participants were 
equally negative about how successful and satisfied they were with their performance
across all feedback types. The results also indicated that haptic feedback and its
constituent parts each played some part in the reduction of participants’ perception of
mental demand. The combination of TLX, SEQ and SMEQ usability ratings indicates
a general level of dissatisfaction with performance for each feedback type.

The UEQ data from the study highlighted a significant difference between the 
users’ experience of efficiency and dependability across all feedback stages. For 
efficiency ratings, significant differences were observed between haptic and force 
feedback and tactile and no feedback ratings. This denoted that the evaluation of the 
participants’ experience of work performed to total effort expended was not affected 
by tactile feedback, but by force feedback alone. Similarly, the participants’ appraisal 
of dependability displayed the same evaluation characteristics. The participants’ 
experience and assessment of device reliability showed that they felt that the tactile 
and no feedback stages were less reliable than the haptic and force stages, regardless 
of there being no measurable effect of feedback in accuracy and move time. 

Subsequently, critical incidents for each feedback stage were assessed. Overall, 
the CIT analysis revealed some interesting trends. The most obvious of these was the 
decrease in positive comments and the increase in negative comments made as
feedback was removed from the interaction. Additionally, participants were notably
more vocal about their personal preferences when interacting with each feedback
stage. This trend highlighted the importance of performer individuality and prior 
experiences when designing, building and using a DMI device with feedback. This 
would imply the need for a more explorative investigation methodology in the evalu- 
ation of experience. This aspect could be further expanded upon in user case studies 
and involve the further consideration of creative applications in its analysis. 

With the specific matching and categorisation of the devices and the quantitative 
and qualitative data recorded during functionality testing, the results of the experi- 
ment showed that the effect of haptic feedback and its derivatives could be measured 
in the operation of a DMI, with accurate data measures. These findings denoted inter- 
esting results for the different types of feedback displayed to the user, and although 
there was no direct effect upon the quantitative performance of the DMI, feedback
may still be revealed to have some positive influence upon the user’s perceptual 
experience when applying them in note-level-control metaphors, musical exercises, 
and explorative or creative contexts. 

The discipline of HCI has a wide range of evaluation frameworks for the appraisal 
of digital technology as applied to simple, multiparametric tasks. This includes
evaluation techniques that are designed to discover issues that arise in unique
applications of technology, such as the effects of haptics in DMI design. For the appraisal of
complex devices, HCI evaluation techniques can be incorporated in the evaluation 
of usability and user experience. In addition to this, the subject of human comput- 
ing (or human-centred computing) can also be used to evaluate the user’s intentions 
and motivations in the application of technology in creative contexts. As has been 
presented here, an appraisal of function, as a task-focused approach, presents met- 
rics that are easy to measure and quantify. However, in the creation of music, the 
application of technology relies upon the user’s previous training and experiences to 
accurately express the musicians’ inner thoughts and intentions. 

It is therefore proposed that, although DMIs require functional testing to highlight 
potential usability issues, a comprehensive analysis should also include the evaluation
of real-world situations to accurately capture and evaluate all aspects of an interaction. 
Thus, to expand our investigation of haptics into the real world, a music-focused anal- 
ysis should also be undertaken. This idea emphasises the “third paradigm” concept, 
which includes the gathering of information relating to culture, emotion and previ- 
ous experience. Our results show task-focussed evaluations are indeed a necessary 
precursor to experience-focussed assessment. However, task-focussed evaluations, 
when carried out in isolation, do not present sufficient information about the user or 
device in real-world applications of such technology. 

Interaction information pertaining to acoustic musical instrument design already 
exists; therefore, data can be measured and used in DMI interaction design to provide 
a sense of realism and embodiment to virtual or augmented instruments or expanded 
upon to fit new design types [21]. Many digital musicians are recognised for their 
creativity, innovation and adaptation in the design and construction of DMIs; how- 
ever, these digital instruments are often still devoid of haptic feedback. It is possible 
to reconstruct the operating principles of acoustic instruments and apply them to 
DMIs, as is seen in augmented instruments and DMIs that replicate the playing style 
of an acoustic instrument. For a performer, however, the emptiness of assignable 
“button bashing” may be seen as a negative characteristic. DMIs offer freedoms to 
musicians that are near endless, but digital music performers often also play conven- 
tional instruments, highlighting the need to experience the creation of music with all 
senses engaged. 

If multimodal collocations are possible within DMI design, it should also be pos- 
sible to simulate the haptic experience of an acoustic performance. Sound can be 
created electronically with the freedoms afforded through digital sound generation 
and with the combined information of the interaction response being fed back with 
comparable meaning as an acoustic instrument. Sound can be digitally created and 
manipulated by the artist, and a deeper sense of craft can potentially be realised.
Computer musicians need to be able to experience consistency, adaptability, musicality
and touch-related sensations if they are to encounter the physiological and
psychological occurrences outlined within each of the research conclusions
presented here.


6.5 Conclusions


In this chapter, it has been seen that the addition of haptics to DMI feedback 
archetypes enhances the user experience, but does not appear to impact on the effec- 
tiveness (move time) or accuracy of the functional elements of DMIs. Additionally, 
from the analysis of feedback in auditory interactions, it has been demonstrated how 
an HCI-informed framework can be applied in the evaluation of DMI design.
Specifically, it was observed how a device’s analysis can be informed by HCI techniques
that are applied in the evaluation of general computing and computing for unique or 
creative applications. Regarding the experimental results presented here, the func- 
tional capacity of haptic, force, tactile and no feedback afforded to users in tasks 
that require the selection of specific frequencies was quantified and evaluated. The 
accumulation of differences observed within this analysis revealed influential factors 
of information feedback on the user’s experiences in functional application contexts. 

From the data gathered, DMI feedback appeared to be influential on several
context-dependent levels. In the study, there was found to be no significant effect of feedback
upon the quantifiable performance capacities of the tested feedback stages. However, 
when questioning the participants further, there were discovered to be important 
inequalities in the perception of usability and experience when completing the task. 
Within these areas, the musician’s perception of performance was found to be more 
favourable with the presence of both tactile and force feedback. Therefore, it can be 
concluded from this experiment that haptic feedback has some positive effect upon 
many perceptual experiences in the application of DMI technology and should be 
further investigated in the field. 

It is expected that the study of interactions between performers and digital instru- 
ments in a variety of contexts will continue to be of research interest. Research on 
digital musical instruments and interfaces for musical expression will continue to 
explore the role of haptics, incorporating user experience and the frameworks that 
are constructed to quantify the relationship between musical performers and new 
musical instruments. The complexities of these relationships are further complicated 
by the skills of musicians and are far greater and more meaningful than a physically 
stimulating interaction. 

It has been shown in this work that digital musical instrument design and
evaluation methodologies can be applied in the study of interactions between musicians
and instruments. However, it is suggested that emergent DMI systems require further
measures for an accurate appraisal of the user’s experience when applying the device 
in a musical context [22]. In a traditional HCI analysis, a device is evaluated in a 
specific context and the evaluation methods are expert-based heuristic evaluations 
or user-based experimental evaluations. Only by determining context is it possible 
to interpret correctly the data gathered. Therefore, it is suggested that DMI-specific 
functionality, usability and user experience evaluation methods should be developed. 

The work presented has only begun to explore the possibilities of haptic feedback 
in future DMI designs. The experiment endeavoured to present evidence of some 
influence that haptic feedback has on a user’s perception of functionality, usability
and user experience. Beyond this, future research goals should include long-term
studies, and the development of tools to assist in the creation of DMI designs, to allow
designers to experiment with different gestural interface models. Within this space,
composers, performers and DMI designers will be able to explore the affordances of 
technologies in the creation of new instruments for musical expression.


Chapter 7
Auditory-Tactile Experience of Music


Sebastian Merchel and M. Ercan Altinsoy 


Abstract We listen to music not only with our ears. The whole body is present in 
a concert hall, during a rock event, or while enjoying music reproduction at home. 
This chapter discusses the influence of audio-induced vibrations at the skin on musi- 
cal experience. To this end, sound and body vibrations were controlled separately 
in several psychophysical experiments. The multimodal perception of the resulting 
concert quality is evaluated, and the effect of frequency, intensity, and temporal vari- 
ation of the vibration signal is discussed. It is shown that vibrations play a significant 
role in the perception of music. Amplifying certain vibrations in a concert venue or 
music reproduction system can improve the music experience. Knowledge about the 
psychophysical similarities and differences of the auditory and tactile modalities helps
to develop perceptually optimized algorithms to generate music-related vibrations.
These vibrations can be reproduced, e.g., using electrodynamic exciters mounted to 
the floor or seat. It is discussed that frequency shifting and intensity compression are 
important approaches for vibration generation. 


7.1 Introduction 


Several chapters in this book discuss the influence of haptic cues provided by instru- 
ments to musicians. Usually, the forces and vibrations at the skin are directly excited 
by a physical contact with the instrument. However, the radiated sound itself can 
stimulate the surface of the human body too. This is true for musicians and music 
listeners alike. The main hypothesis to be evaluated in this chapter is that vibrations 
at the listener’s skin might be important for the perception of music. If the
vibratory component is missing, the perceived quality might change, e.g., for a concert
experience. From another perspective, the perceived quality of a concert hall or a
conventional audio reproduction system might be improved or impaired by adding
vibrations. These vibrations can be excited directly via the air or via the surfaces that 
are in contact with the listener. This study focuses on seat vibrations, such as those 
that can be perceived in a classical chamber concert hall. Measurements in an exem- 
plary concert hall and a church confirmed the existence of seat vibrations during real 
music performances [27]. If a kettledrum is hit or the organ plays a tone, the ground 
and chair vibrate. The vibratory intensity and frequency spectra are dependent on 
various factors, e.g., room modes or construction parameters of the floor. Neverthe- 
less, in many cases, the concert listener may not recognize the vibrations as a separate 
feature because the tactile percept is integrated with the other senses (e.g., vision and 
hearing) into one multimodal percept. Even if the listener is unaware of vibrations, 
they can have an influence on recognizable features of the concert experience, e.g., 
the listener’s presence or envelopment—parameters that are of vital importance in 
determining the quality of concert halls [8]. 

Unfortunately, there is no vibration channel in conventional music recordings. 
Therefore, it would be advantageous if a vibration signal could be generated using 
the information stored in existing audio channels. This approach might be reasonable 
because the correlation between sound and vibration is naturally strong in everyday 
situations. 

Two pilot experiments were conducted and described by Merchel et al. [24, 25], 
who investigated the influence of seat vibrations on the overall quality of the repro- 
duction of concert DVDs. Low-pass-filtered audio signals were used for vibration 
generation through a shaker mounted to a seat. In many cases, participants preferred
the conditions in which vibrations were present, reporting that something was missing
if seat vibration was turned off. However, different complaints were reported: It was stated
that the high-frequency vibrations were sometimes prickling and therefore unpleas- 
ant; several participants reported that some vibrations were too strong and that others 
were too weak or completely missing; it was also noted that the sound generated by 
the vibration chair at higher frequencies (indeed, a side-effect) was disturbing. In the 
aforementioned experiments, a precisely calibrated vibration actuator was applied 
that was capable of reproducing frequencies from 10 to 200 Hz and higher. In
practical applications, smaller and less expensive vibration actuators would be
beneficial; however, such shakers are typically limited to a small frequency range around a
resonance frequency or they are not powerful enough for the present application.

Our work aims to broaden the understanding of the coupled perception of music 
and vibration by addressing the following questions: Can vibration-generation algo- 
rithms be obtained that result in an improved overall quality of the concert experience 
compared with reproduction without vibration? Which algorithms are beneficial in 
terms of silent and simple vibration reproduction? In this chapter, algorithms are 
described that were developed and evaluated to improve music-driven vibration gen- 
eration, taking into account the above questions and complaints. The content is based 
on several papers [3, 27, 28] and the dissertation of the first author with the title
‘Auditory-Tactile Music Perception’ [23] with kind permission from Shaker Verlag.


7.2 Experimental Design


In this section, the applied music stimuli, the experimental setup, participants, and 
procedure are described. Different vibration-generation approaches will be discussed 
and evaluated in the following section. 


7.2.1 Stimuli 


To represent typical concert situations for both classical and modern music, four 
sequences were selected from music DVDs [7, 21, 45, 46] that included significant 
low-frequency content. A stimulus duration of approximately 1.5 min was chosen 
to ensure that the participants had sufficient time to become familiar with it before 
providing quality judgments. The following sequences were selected: 


Bach, Toccata in D minor (church organ) 

Verdi, Messa Da Requiem, Dies Irae (kettledrum, contrabass) 

Dvorak, Slavonic Dance No. 2 in E minor, op. 72 (contrabass) 

Blue Man Group, The Complex, Sing Along (bass, percussion, kick drum) 


The first piece, Toccata in D minor, is a well-known organ work that is referred to as 
BACH. A spectrogram of the first 60 s is plotted in Fig. 7.1a, which shows a rising and
falling succession of notes covering a broad frequency range, as well as steady-state 
tones with a rich overtone spectrum that dominate the composition. Strong vibrations 
would be expected in a church for this piece of music [27]. The second sequence, Dies 
Irae, abbreviated as VERDI, is a dramatic composition for double choir and orchestra. 
A spectrogram is plotted in Fig. 7.1b: Impulsive fortissimo sections with a concert
bass drum, kettledrum, and tutti orchestra alternate quickly with sections dominated 
by the choir, bowed instruments, and brass winds. The sequence is characterized by 
strong transients. The third stimulus, Slavonic Dance No. 2 in E minor, is referred 
to as DVORAK, and is a calm orchestral piece, dominated by bowed and plucked 
strings. Contrabasses and cellos continuously generate low frequencies at a low 
level (see spectrogram in Fig. 7.2a). The fourth sequence, Sing Along, is a typical
pop music example performed by the Blue Man Group, which is further shortened 
to BMG. The sequence is characterized by the heavy use of drums and percussion. 
These instruments generate transient content at low frequencies, which can be seen 
in the corresponding spectrogram in Fig. 7.2b. Additionally, a bass line can be easily 
identified. 

To generate a vibration signal from these sequences, the sum was calculated of 
the low-frequency effects (LFE) channel and the three respective frontal channels. 
No low-frequency content was contained in the surround sound channels in any 
situation. Pure Data (Pd) was used for this purpose. During the process, several 
signal processing parameters were varied: A detailed description of the different 
approaches is presented in Sect. 7.3. 
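
As an illustration, the summation described above can be sketched in a few lines of
Python. The study itself used Pure Data for this step; the function name, the equal
weighting of the four channels and the sample values below are assumptions made for
this sketch, not details reported in the text.

```python
def mono_vibration_source(front_left, front_center, front_right, lfe):
    """Sum the three frontal channels and the LFE channel sample by sample.

    Equal (unity) weighting of all four channels is an assumption of this sketch.
    """
    lengths = {len(front_left), len(front_center), len(front_right), len(lfe)}
    if len(lengths) != 1:
        raise ValueError("all channels must have the same number of samples")
    return [l + c + r + sub
            for l, c, r, sub in zip(front_left, front_center, front_right, lfe)]


# Hypothetical three-sample excerpt of each channel (not data from the study).
mono = mono_vibration_source([0.5, 0.0, -0.25], [0.25, 0.25, 0.0],
                             [0.0, -0.5, 0.25], [0.25, 0.25, 0.5])
```

In practice the resulting mono sum would still need to be scaled to avoid clipping
before any further processing; that step is left out here.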

Fig. 7.1 Spectrograms of the mono sums for 60 s from the (a) BACH and (b) VERDI sequences. The
short-time Fourier transforms (STFTs) were calculated with 8192 samples using 50% overlapping
Hann windows

Fig. 7.2 Spectrograms of the mono sums for 60 s from the (a) DVORAK and (b) BMG sequences. The
short-time Fourier transforms (STFTs) were calculated with 8192 samples using 50% overlapping
Hann windows


7.2.2 Synchronization


For a good multisensory concert experience, it is recommended that input from all 
sensory systems should be integrated into one unified perception. Therefore, the delay 
between different sensory inputs is an important factor. Many published studies have 
focused on the perception of synchrony between modalities, mostly related to audio- 
visual delay (e.g., [12, 38]). Few studies have focused on the temporal aspects of 
acoustical and vibratory stimuli. These studies have differed in the types of repro- 
duced vibration (vibrations at the hand, forearm, or seat vibration), types of stimuli 
(sinusoidal bursts, pulses, noise, instrumental tones, or instrumental sequences), and 
experimental procedures (time-order judgments or the detection of asynchrony). 
However, some general conclusions can be drawn. 

It was reported that audio delays are more difficult to detect than audio advances. 
Hirsh and Sherrick [17] found that a sound must be delayed 25 ms against hand- 
transmitted sinusoidal bursts to detect that the vibration preceded the sound. However, 
vibrations had to be delayed only 12 ms to detect asynchrony. A similar asymmetry 
was observed by Altinsoy [1] using broadband noise bursts reproduced via head- 
phones and broadband vibration bursts at the fingertip: Stimuli with audio delays 
of approximately 50 to −25 ms were judged to be synchronous, and the point of
subjective simultaneity (PSS) shifted toward an audio delay of approximately 7 ms. 
Detection thresholds for auditory-tactile asynchrony appear to also depend on the 
type of stimulus. In an experiment reproducing broadband noise and sinusoidal seat 
vibrations, audio delays from 63 to −47 ms were found to be synchronous [2]. Using
the same setup, audio delays from 79 to −58 ms were judged to be synchronous
regarding sound and seat vibrations from a car passing a bump [2]. 
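
The reported synchrony windows can be collected into a small lookup, sketched below in
Python, to check whether a given audio delay would fall inside the range judged
synchronous for a stimulus type. The window limits are those quoted above; the
dictionary keys, the function name and the reading of positive values as "sound lags
the vibration" are assumptions of this sketch.

```python
# Synchrony windows (audio delay in ms) as quoted in the text.
# Assumed sign convention: positive = sound lags vibration, negative = sound leads.
SYNCHRONY_WINDOWS_MS = {
    "noise bursts, fingertip": (-25, 50),
    "noise and sinusoidal seat vibration": (-47, 63),
    "car passing a bump, seat": (-58, 79),
}


def judged_synchronous(audio_delay_ms, condition):
    """Return True if the delay lies inside the reported window for the condition."""
    lower, upper = SYNCHRONY_WINDOWS_MS[condition]
    return lower <= audio_delay_ms <= upper


# A 60 ms audio delay lies outside the fingertip window but inside the seat window.
fingertip_ok = judged_synchronous(60, "noise bursts, fingertip")
seat_ok = judged_synchronous(60, "noise and sinusoidal seat vibration")
```

The windows are group-level approximations, so a lookup of this kind can only give a
rough indication for an individual listener.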

For musical tones, the PSS appears to vary considerably for instruments with
different attack or decay times. For example, PSS values as high as −135 ms for pipe
organ or −29 ms for bowed cello have been reported [9, 43]. In contrast, PSS values
as low as −2 ms for kick drum or −7 ms for piano tones were obtained [43].
Similarly, low PSS values were obtained using impact events reproduced via a vibration
platform [22].

Thus, auditory-tactile asynchrony detection appears to depend on the reproduced 
signal. Impulsive content is clearly more sensitive to delay between modalities. Because
music often contains transients, the delay between sound and vibration in this study 
was set to 0 ms. However, for a real-time implementation of audio-generated vibration 
reproduction, a slight delay appears to be tolerable or even advantageous in some 
cases. Additionally, the existence of perceptual adaptation mechanisms, which can
widen the temporal window for auditory-tactile integration after prolonged exposure
to asynchronous stimuli, has been demonstrated [37].


7.2.3 Setup


A reproduction system was developed that is capable of separately generating seat 
vibrations and sound. A surround setup was used, according to ITU-R BS.775-1 [18], 
with five Genelec 8040A loudspeakers and a Genelec 7060B subwoofer. The system 
was equalized to a flat frequency response at the listener position. To place the 
participant in a standard multimedia reproduction context, an accompanying movie 
from the DVD was projected onto a silver screen. The video sequence showed the 
stage, conductor, or individual instrumentalists while playing. 

Vibrations were reproduced using a self-made seat based on an RFT Messelek- 
tronik Type 11076 electrodynamic shaker connected to a flat, hard wooden board 
(46 cm × 46 cm). Seat vibrations were generated vertically, as shown in Fig. 7.3.

Fig. 7.3 Vibration chair with electrodynamic exciter

Fig. 7.4 Body-related transfer functions measured at the seat surface of the vibration chair, with
and without compensation plotted with 1/24th octave intensity averaging


The participants were asked to sit on the vibration seat, with both feet flat on the
ground. If necessary, wooden plates were placed beneath the participant’s feet to
adjust for different lengths of legs. The transfer characteristic of the vibrating chair
(relation between acceleration at the seat surface and input voltage) was strongly
dependent on the individual person. This phenomenon is referred to as the
body-related transfer function (BRTF). Differences of up to approximately 10 dB have been
measured for different participants [5]. Considering the just-noticeable difference
for vertical seat vibrations, which is approximately 1 dB [6, 13, 36],
the individual BRTFs should be compensated for during perceptual investigations.
The BRTF of each participant was individually monitored and equalized during all
experiments. Participants were instructed not to change their sitting posture after
calibration until the end of the experiment. The transfer functions were measured
using a vibration pad (B&K Type 4515B) and a Sinus Harmonie Quadro measuring
board, and they were compensated for by means of inverse filtering in MATLAB.
This procedure resulted in a flat frequency response over a broad frequency range
(±2 dB from 10 to 1000 Hz). An exemplary BRTF, with and without individual
compensation, is shown in Fig. 7.4.
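
The idea behind the compensation can be illustrated with a simplified, magnitude-only
sketch in Python. It is not the MATLAB inverse filter used in the study: the function
name, the per-band representation, the boost limit and the example values are all
assumptions, and a real inverse filter would also have to deal with phase and with
noise amplification.

```python
def compensation_gains_db(measured_brtf_db, target_db=0.0, max_boost_db=20.0):
    """Return per-band gains (dB) that flatten a measured magnitude response.

    The gain is the difference between the target level and the measured level,
    limited to max_boost_db (an assumed safeguard against excessive amplification).
    """
    gains = []
    for level in measured_brtf_db:
        gain = target_db - level
        gains.append(min(gain, max_boost_db))
    return gains


# Hypothetical BRTF magnitudes (dB) in four frequency bands, not measured data.
measured = [-4.0, 0.0, 6.0, -10.0]
gains = compensation_gains_db(measured)
equalized = [m + g for m, g in zip(measured, gains)]
```

With these made-up values the equalized response is flat at 0 dB in every band, which
mirrors the flat response that the individual compensation aims for.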


7.2.4 Participants 


Twenty participants took part voluntarily in this experiment (14 male and six
female). Most of them were students between 20 and 55 years old (mean 24 years)
and weighing between 58 and 115 kg (mean 75 kg). All of the participants stated that they had
no known hearing or spine damage. The average number of self-reported concert 
visits per year was nine, and ranged from one to approximately 100. Two
participants were members of bands. The preferred music styles varied, ranging from
rock and pop to classical and jazz. Fifteen participants had not been involved in
music-related experiments before, whereas five had already participated in two sim- 
ilar pilot experiments [24, 25]. 


7.2.5 Procedure 


The concert recordings were played back to each participant using the audio setup 
described above, while vibrations were reproduced using the vibration chair. The 
vibration intensities were initially adjusted so that the peak acceleration levels 
reached approximately 100 dB (re 10⁻⁶ m/s²), which were clearly perceptible.
However, perception thresholds can vary heavily between participants [32]; therefore, 
each participant was asked to adjust the vibration amplitude to the preferred level. 
This adjustment was typically performed within the first 5–10 s of a sequence.
Subsequently, the participant had to judge the overall quality of the concert experience
using a quasi-continuous scale. Verbal anchor points ranging from bad to excellent 
were added, similar to the method described in ITU-T P.800 [19]. Figure 7.5 presents 
the rating scale that was used. 
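
To give a feeling for the initial vibration level, the short Python sketch below
converts between acceleration level in dB and acceleration in m/s², taking the
reference acceleration as 10⁻⁶ m/s² and the usual definition L = 20 log10(a / a_ref).
The function names are chosen for this sketch only.

```python
import math

A_REF = 1e-6  # reference acceleration in m/s^2


def level_to_acceleration(level_db, a_ref=A_REF):
    """Convert an acceleration level in dB (re a_ref) to acceleration in m/s^2."""
    return a_ref * 10 ** (level_db / 20)


def acceleration_to_level(acceleration, a_ref=A_REF):
    """Convert an acceleration in m/s^2 to a level in dB (re a_ref)."""
    return 20 * math.log10(acceleration / a_ref)


# The initial peak level of approximately 100 dB corresponds to roughly 0.1 m/s^2.
peak_acceleration = level_to_acceleration(100.0)
```

Under this convention every 20 dB corresponds to a factor of ten in acceleration.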

To prevent dissatisfaction, the participants could interrupt the current stimulus as
soon as they were confident in their judgment. The required time varied from 30 s
to typically no more than 60 s. After rating the overall quality, the participants were
encouraged to briefly formulate reasons for their judgments.

Each participant was asked to listen to 84 completely randomized stimuli, 21 for 
each music sequence. The stimuli were divided into blocks of eight. After each block, 
the participant had the opportunity to relax before continuing with the experiment. 
Typically, it took approximately 35 min to complete three to four blocks. After 45 min 
at most, the experimental session was interrupted and was continued on the next day 
(and the next, if necessary). Thus, two to three sessions were required for each 
participant to complete the experiment. 
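
The stimulus schedule described above can be sketched as follows in Python. The
experiment itself was controlled from MATLAB; the function name, the integer treatment
indices and the seed are placeholders introduced for this sketch. Note that 84 stimuli
in blocks of eight give ten full blocks and a final block of four.

```python
import random

SEQUENCES = ["BACH", "VERDI", "DVORAK", "BMG"]
N_TREATMENTS = 21   # stimuli per music sequence, as stated in the text
BLOCK_SIZE = 8      # stimuli per block, as stated in the text


def make_blocks(seed=None):
    """Return a fully randomized stimulus order, split into blocks of BLOCK_SIZE."""
    rng = random.Random(seed)
    # Treatment indices 0..20 are placeholders for the 21 stimuli per sequence.
    stimuli = [(seq, t) for seq in SEQUENCES for t in range(N_TREATMENTS)]
    rng.shuffle(stimuli)
    return [stimuli[i:i + BLOCK_SIZE] for i in range(0, len(stimuli), BLOCK_SIZE)]


blocks = make_blocks(seed=1)
```

Passing a different seed per participant would give each person an individual order
while keeping the schedule reproducible.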

Before starting the experiment, the participants had to undergo training with three 
stimuli to become familiar with the task and stimulus variations. The stimuli consisted
of the first 90 s from BMG using three very different vibration-generation approaches.
This training was repeated before each subsession. 

MATLAB was used to control the entire experimental procedure (multimodal 
playback, randomization of stimuli, measurement and calibration of individual 
BRTFs, guided user interface, and data collection). 


Fig. 7.5 Rating scale for evaluation of the overall quality of the concert experience
(scale label: Overall Quality; verbal anchors: Excellent, Good, Fair, Poor, Bad)


7.3 Vibration Generation: Approaches and Results


Five different approaches to generating vibration stimuli from the audio signal are 
described in this section. The first four approaches were implemented to modify 
mainly the frequency content of the signal. The main target was to reduce higher fre- 
quencies in order to eliminate tingling sensations and to avoid high-frequency sound 
radiation. In Sect. 7.3.1 the effect of simple low-pass filtering is evaluated. Reduction 
of the vibration signal to the fundamental frequency is discussed in Sect. 7.3.2. A
frequency shifting algorithm is applied in Sect. 7.3.3, and substitution with artificial 
vibration signals is discussed in Sect. 7.3.4. In contrast to these frequency-domain 
algorithms, the last approach (described in Sect. 7.3.5) targets the dynamic range,
thus affecting the perceived intensity of the vibration signal. 


7.3.1 Low-Pass Filtering 


The simplest approach would be to route the sound (sum of the three frontal channels 
and LFE channel) directly to the vibration actuator. With some deviations, this pro- 
cess would correspond to the approximately linear transfer functions between sound 
pressure and vibration acceleration measured in real concert venues [27]. However, 
participants typically chose higher vibration levels in the laboratory, which resulted 
in significant sound generation from the actuator, especially in the high-frequency 
range. To address this, the signal was low-pass-filtered using a steep 10th-order But- 
terworth filter with cutoff frequency set to either 100 or 200 Hz, as illustrated in 
Fig. 7.6. However, the spurious sound produced by the vibration system could not be 
completely suppressed. The resulting multimodal sequences were reproduced and 
evaluated in the manner described above. 
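
As a rough illustration of the low-pass stage, the following sketch builds a 10th-order Butterworth low-pass as a cascade of five second-order sections, using the standard bilinear-transform biquad formulas. It is not the authors' MATLAB implementation; the function names and the 48 kHz sample rate in the usage note are assumptions made for this example.

```python
import math


def butterworth_lowpass_sections(order, cutoff_hz, sample_rate):
    """Return biquad coefficients (b0, b1, b2, a1, a2) for an even-order
    Butterworth low-pass, one tuple per second-order section."""
    if order % 2 != 0:
        raise ValueError("this sketch only handles even filter orders")
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    cos_w0, sin_w0 = math.cos(w0), math.sin(w0)
    sections = []
    for k in range(order // 2):
        # Each Butterworth pole pair has its own quality factor.
        theta = math.pi * (2 * k + 1) / (2 * order)
        q = 1.0 / (2.0 * math.sin(theta))
        alpha = sin_w0 / (2.0 * q)
        a0 = 1.0 + alpha
        b1 = (1.0 - cos_w0) / a0
        sections.append((b1 / 2.0, b1, b1 / 2.0,
                         -2.0 * cos_w0 / a0, (1.0 - alpha) / a0))
    return sections


def apply_sections(signal, sections):
    """Filter a list of samples through all sections in series."""
    out = list(signal)
    for b0, b1, b2, a1, a2 in sections:
        x1 = x2 = y1 = y2 = 0.0
        for n, x0 in enumerate(out):
            y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            x2, x1 = x1, x0
            y2, y1 = y1, y0
            out[n] = y0
    return out


def rms(values):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))
```

Calling `butterworth_lowpass_sections(10, 100.0, 48000)` or `butterworth_lowpass_sections(10, 200.0, 48000)` gives the two cutoff settings compared in this section; a component well above the cutoff should be attenuated very strongly, while one well below it passes almost unchanged.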

For the statistical analysis, the individual quality ratings were interpreted as 
numbers on a linear scale from 0 to 100, respectively corresponding to ‘bad’ and 
‘excellent.’ The data were checked for a sufficiently normal distribution with the 
Kolmogorov–Smirnov test (KS test). A two-factor repeated-measures ANOVA was
performed using the SPSS statistical software,¹ which also checks for the homogeneity
of variances. The two factors were the played music sequence and the applied
treatment. Averaged results (20 participants) for the overall quality evaluation are 
plotted in Fig. 7.7 as the mean and 95% confidence intervals. The quality ratings for 
the concert reproduction without vibration are shown on the left. 
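
The conversion of ratings into scale units and the plotted summary statistics can be illustrated with a small sketch. The equal spacing of the five labels (25 units apart) is an assumption consistent with the 0 to 100 interpretation; the t-based interval is a generic between-participant estimate rather than the SPSS output used in the study, and the value 2.093 is the two-sided 95% critical value of Student's t for 19 degrees of freedom (20 participants).

```python
import math
import statistics

# Assumed equal spacing of the five verbal anchors on the 0-100 scale.
SCALE_UNITS = {"bad": 0.0, "poor": 25.0, "fair": 50.0,
               "good": 75.0, "excellent": 100.0}

# Two-sided 95% critical value of Student's t for 19 degrees of freedom.
T_CRIT_DF19 = 2.093


def mean_and_ci95(ratings, t_crit):
    """Return (mean, lower bound, upper bound) of a 95% confidence interval."""
    n = len(ratings)
    mean = statistics.fmean(ratings)
    half_width = t_crit * statistics.stdev(ratings) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width
```

With these assumptions, a difference of about 25 scale units corresponds to one step between neighbouring labels, e.g., from 'fair' to 'good.'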

Reproduction with vibration was judged to be better than reproduction without 
vibration. Post hoc pairwise comparisons confirmed that both low-pass treatments 
were judged to be better than the reference condition at a highly significant level (p < 
0.01), both with an average difference of 27 scale units, using Bonferroni correction 
for multiple testing. This finding corresponds to approximately one unit on the
five-point scale shown in Fig. 7.5. The effect seems to be strongest for the BMG pop
music sequence; however, no significant effects for differences between sequences
or interactions between sequences and treatments were observed.

¹ https://en.wikipedia.org/wiki/SPSS. Last accessed on Nov 10, 2017.

Fig. 7.6 Signal processing chain to generate vibration signals from the audio sum. The signal
was filtered with a variable low-pass filter, and the BRTF of the vibration chair was compensated
individually

Using the 200 Hz cutoff frequency, the participants occasionally reported tingling
sensations on the buttocks or thighs, which only a few of them liked. This finding could
explain the slightly larger confidence intervals for this treatment.

The positive effect of reproducing vibrations generated by simple low-pass filtering
and the negligible difference between the low-pass frequencies of 100 and
200 Hz are in agreement with earlier results [25].


7.3.2 Reduction to Fundamental Frequency 


In the previous section, low-pass-filtered vibrations were found to be effective for 
multimodal concert reproduction. However, especially for the low-pass 200 Hz con- 
dition, some spurious sound was generated by the vibration system. This fact is 
particularly critical if the audio signal is reproduced for one person via headphones,
as a second person in the room would be quite disturbed by only hearing the sound
generated by the shaker. An attempt was undertaken to further reduce such undesired 
sound. This goal could be accomplished, e.g., by insulating the vibrating surfaces 
as much as possible. Because good insulation is difficult to achieve in our case, one 
effective approach would be to reduce the vibration signal to the fundamental spectral 
component contained in the signal. 

A typical tone generated by an instrument consists of a strong fundamental fre- 
quency and several higher-frequency harmonics. If different frequencies are pre- 
sented simultaneously, strong masking effects toward higher frequencies can be 
observed in the tactile domain [14, 41]. It can be assumed that the fundamental 
component considerably masks higher frequencies. Therefore, it might be possible 
to remove the harmonics completely in the vibration-generation process without 
noticeable effects. This approach is illustrated in Fig. 7.8. The fundamentals below 
200 Hz of the summed audio signals were tracked using the Fiddle algorithm [39] in 
Pd, which detects spectral peaks. The cutoff frequency of a first-order low-pass filter 
was then adaptively adjusted to the lowest frequency peak (i.e., the fundamental). 
If no fundamental was detected, the low-pass filter was set to 100 Hz to preserve 
broadband impulsive events. 
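
The adaptive filtering logic can be sketched as follows. The peak detection itself (done with Fiddle in Pd in the study) is not reproduced here: the sketch assumes that a list of detected peak frequencies is already available for each signal frame. The function names, the frame-wise update, and the one-pole coefficient formula are assumptions of this example.

```python
import math

FALLBACK_CUTOFF_HZ = 100.0   # used when no fundamental is detected
MAX_FUNDAMENTAL_HZ = 200.0   # only peaks below this limit are considered


def choose_cutoff(peak_frequencies_hz):
    """Pick the lowest detected peak below 200 Hz, else fall back to 100 Hz."""
    candidates = [f for f in peak_frequencies_hz if 0.0 < f < MAX_FUNDAMENTAL_HZ]
    return min(candidates) if candidates else FALLBACK_CUTOFF_HZ


def adaptive_lowpass(frames, peaks_per_frame, sample_rate):
    """First-order low-pass whose cutoff is re-tuned once per frame.

    frames          -- list of sample lists
    peaks_per_frame -- list of peak-frequency lists, one per frame
    """
    state = 0.0
    output = []
    for frame, peaks in zip(frames, peaks_per_frame):
        cutoff = choose_cutoff(peaks)
        coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff / sample_rate)
        for x in frame:
            state += coeff * (x - state)   # one-pole smoothing step
            output.append(state)
    return output
```

Keeping the filter state across frames avoids discontinuities when the cutoff changes from one frame to the next.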

The results from the evaluation of the resulting concert reproduction are plotted
in Fig. 7.9. The statistical analysis was executed in the same manner as in the
previous section. Again, the overall quality of the concert experience improved when
vibrations were added (very significant, p < 0.01). At the same time, the generation
of high-frequency components could be reduced, except for conditions in which
the fundamental frequency approached 200 Hz, e.g., in the VERDI sequence (see
Fig. 7.1b). For VERDI and DVORAK, some participants again reported tingling
sensations. For BMG and DVORAK, the participants reported that it was difficult to
adjust the vibration magnitude because the vibration intensity varied unexpectedly.

Fig. 7.8 Signal processing chain to generate vibration signals from the audio sum. The fundamental
below 200 Hz was tracked, and an adaptive low-pass filter was adjusted to this frequency to suppress
all harmonics. If no fundamental was detected, the low-pass filter was set to 100 Hz

The average difference in perceived quality with and without vibrations was 26
scale units. Interestingly, the differences between sequences increased. The strongest
effect was observed for the BMG sequence compared with the other sequences (significant
interaction between treatment and sequence, p < 0.05). The spectrogram
in Fig. 7.2b reveals that for the BMG sequence, the fundamentals always lie below
100 Hz and the first harmonic almost always lies above 100 Hz. Therefore, the fundamental
filtering, as implemented here, almost corresponded to the low-pass-filtering
condition with a cutoff at 100 Hz. As expected, the resulting overall quality was
judged to be similar in both cases (no significant difference; compare with Fig. 7.7).

In addition, Fig. 7.2b reveals that the first harmonic of the electric bass is slightly
stronger than the fundamental. However, the intensity balance between fundamentals
and harmonics is constant over time, resulting in a good match between sound
and vibration. This is not the case for the BACH sequence, plotted in
Fig. 7.1a. The intensity of the lowest frequency component is high within the first 10 s
and then suddenly weakens, whereas the intensities of higher frequencies increase
simultaneously. If only the lowest frequency is reproduced as a vibration, this change
in balance between frequencies might result in a mismatch between auditory and tactile
perception, which would explain the poor-quality ratings for the BACH sequence
using the fundamental frequency approach.

With increasing loudness, the tone color of many instruments is characterized by
strong harmonics in the frequency spectrum [34]. The fundamental is not
necessarily the most intense component and can even be completely missing.
Nevertheless, the auditory system still integrates all harmonics into one tone, in which all
partials contribute to the overall intensity. In addition, different simultaneous tones
can be played with different intensities depending on the composition. Therefore,
a more complex approach could be beneficial: the lowest pitch could be estimated
and used to generate the vibration, while the intensity of the vibration would still
depend on the overall loudness within a specific frequency range. In this manner,
a good match between both modalities might be achieved. However, the processing
is complex and could require greater computing capacity. Better matching of the
intensities appears to be a crucial factor and will be further evaluated in Sect. 7.3.5.


7.3.3 Octave Shift 


Another approach would be to shift down the frequency spectrum of the vibration
signal. In this manner, the spurious high-frequency sound could be further reduced
and the tingling sensation eliminated.

Fig. 7.10 Distribution of crossmodal frequency-matched seat vibrations to acoustical tones with
various frequencies f, according to Altinsoy and Merchel [4]


The frequency resolution of the tactile sense is considerably worse than that of 
audition [31]; therefore, it might be acceptable to strongly compress vibration signals 
in the frequency domain while still preserving perceptual integration with the respec- 
tive sound. Earlier experiments have been conducted to test whether participants can 
match the frequencies of sinusoidal tones and vibrations presented through a seat [4]. 
The results are summarized in Fig. 7.10. The participants were able to match the fre- 
quencies of both modalities with some tolerance. In most cases, the participants 
also judged the lower octave of the auditory frequency to be suitable as a vibration 
frequency. Therefore, the decision was made to shift all the frequencies down one 
octave, i.e., dividing their original values by two. This shift corresponds to compression
in the frequency range, with stronger compression toward higher frequencies. As
shown in Fig. 7.11, before pitch-shifting, the original summed audio signal was pre-filtered
via one of the methods described above (i.e., low-pass filtering or reduction
to fundamental frequency). Pitch-shifting was performed in Pd using a granular synthesis
approach: The signal was cut into grains of 1000 samples, which were slowed
to half speed and summed again using overlapping Hann windows. Using this method,
some high-frequency artifacts occurred, which were subsequently filtered out using
an additional low-pass filter set at 100 Hz. The resulting low-pass-shifted vibration
signals were evaluated as described above. Results are plotted in Fig. 7.12. Again,
the statistical analysis was performed using ANOVA after testing the preconditions.
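
A minimal sketch of such a granular octave shift is given below. It is not the Pd patch used in the study: the grain hop (one grain length), the linear interpolation used for half-speed playback, and the 50% overlap of the stretched Hann-windowed grains are assumptions chosen so that the signal duration is preserved. The subsequent 100 Hz low-pass filter is not included.

```python
import math


def octave_down_granular(signal, grain_size=1000):
    """Shift a signal down one octave with Hann-windowed, half-speed grains."""
    out_grain = 2 * grain_size   # a grain played at half speed lasts twice as long
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / out_grain)
              for i in range(out_grain)]
    output = [0.0] * (len(signal) + out_grain)
    for start in range(0, len(signal) - grain_size, grain_size):
        for i in range(out_grain):
            pos = start + 0.5 * i                 # read position advances at half speed
            idx = int(pos)
            frac = pos - idx
            nxt = min(idx + 1, len(signal) - 1)
            sample = (1.0 - frac) * signal[idx] + frac * signal[nxt]
            output[start + i] += window[i] * sample   # overlap-add
    return output[:len(signal)]
```

Because successive grains generally start with different phases, the overlap-add can introduce amplitude fluctuations and spectral artifacts, which is in line with the need for the additional low-pass filter mentioned above.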

For the BACH sequence, shifting the lowest fundamental even farther down
resulted in generally poor-quality ratings. The occasionally weak fundamental components
in this sequence caused a crossmodal intensity mismatch between vibration
and sound, with the sound perceived as louder. However, the perceived quality increases
with the bandwidth of the signal, i.e., when using pre-filtering with higher cutoff
frequency, most likely due to a better intensity match between modalities.

Fig. 7.11 Signal processing chain to generate vibration signals from the audio sum. Compression
was applied in the frequency range by shifting all of the frequencies down one octave using granular
synthesis. To suppress high-frequency artifacts, a 100 Hz low-pass filter was subsequently inserted

Fig. 7.12 Mean overall quality evaluation for no-vibration and various octave-shift
vibration-generation approaches, plotted with 95% confidence intervals

The quality scores for the BMG sequence depend much less on the initial filtering.
As discussed before, the difference between the ‘fundamental’ condition and the
‘low-pass 100 Hz’ condition is small. By octave-shifting the signals, the character
of the vibration changed. Some participants described the vibrations as ‘wavy’ or
‘bumpy’ rather than as ‘humming,’ as they had previously done. However, many
participants liked the varied vibration character, and the averaged quality ratings did
not change significantly compared with Figs. 7.7 and 7.9. No further improvement
was found for broader bandwidth of the pre-filtered signal, for the reasons already
discussed in the previous section.

Results were significantly different for the DVORAK and VERDI sequences. 
In Sect. 7.3.1, no preference for one of the two low-pass conditions was observed. 
However, when these sequences are additionally shifted in frequency, an increase
in quality for the 200 Hz low-pass treatment is found, as shown in Fig. 7.12. This
could be explained by considering the periods during which the lowest frequency
component lies above 100 Hz (e.g., seconds 10–17 of VERDI). By octave-shifting
these components while retaining their acceleration levels, they become perceptually 
more intense due to the decreasing equal-intensity contours for seat vibrations [30]. 
In addition, the vibrations were reported to cause less tingling. The same result held 
true for octave-shifting the fundamental. 

The dependence of the quality scores on the music sequence and the filtering 
approach was confirmed statistically by the very significant (p < 0.01) effects for 
the factor sequence, the factor treatment, and the interaction of both. On average, 
all of the treatment conditions were judged to be better than without vibrations on a 
very significant level (p < 0.01). No statistically significant differences between the 
‘fundamental’ and the ‘low-pass 100 Hz’ conditions were observed. However, the 
‘low-pass 200 Hz’ condition was judged to be slightly but significantly better (p < 
0.05) than the ‘fundamental’ (averaged difference = 11) and the ‘low-pass 100 Hz’ 
(averaged difference = 9) treatments with octave shifting. As explained above, these 
main effects must be interpreted in the context of the differences between sequences. 

It can be concluded that octave-shifted vibrations appeared to be integrable with 
the respective sound in many cases. The best-quality scores were achieved, indepen- 
dent of the sequence used, by applying a higher low-pass frequency, e.g., 200 Hz. 


7.3.4 Substitute Signals 


It was hypothesized in the previous section that the variance in the vibration char- 
acter that resulted from the frequency shift would not negatively influence the qual- 
ity scores. Thus, it might be possible to compress the frequency range even more. 
This approach was evaluated using several substitute signals and is discussed in 
this section. Figure 7.13 presents the signal processing chain. A signal generator 
was implemented in Pd to produce continuous sinusoidal tones at 20, 40, 80, and 
160 Hz. The frequencies were selected to span a broad frequency range and to be 
clearly distinguishable considering the just-noticeable differences (JNDs) for seat 
vibrations [31]. Additionally, a condition was included using white Gaussian noise 
(WGN) low-pass-filtered at 100 Hz. These substitute signals were then multiplied 
with the amplitude envelope of the original low-pass-filtered signal to retain its timing
information. An envelope follower was implemented, which calculated the RMS
amplitude of the input signal using successive analysis windows. Hann windows
with a size of 1024 samples, corresponding to approximately 21 ms, were applied
to avoid smearing the impulsive content. The period for successive analysis was half
of the window size.

Fig. 7.13 Signal processing chain to generate vibration signals from the audio sum. The envelope
of the low-pass-filtered signal was extracted and multiplied with substitute signals, such as sinusoids
at 20, 40, 80, and 160 Hz or white noise

Fig. 7.14 Mean overall quality evaluation for no-vibration and various substitute
vibration-generation approaches, plotted with 95% confidence intervals
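
The envelope extraction and substitution shown in Fig. 7.13 might be sketched as follows, using Hann-weighted RMS frames of 1024 samples with a hop of half the window size. This is an illustration rather than the Pd implementation: the normalization by the window sum and the sample-and-hold of each envelope value over one hop are assumptions of this sketch, and the sample rate of 48 kHz implied by "1024 samples, approximately 21 ms" is passed in explicitly.

```python
import math


def hann_rms_envelope(signal, window_size=1024):
    """Hann-weighted RMS amplitude, one value per hop of window_size // 2 samples."""
    hop = window_size // 2
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / window_size)
              for i in range(window_size)]
    norm = sum(window)
    envelope = []
    for start in range(0, len(signal) - window_size + 1, hop):
        frame = signal[start:start + window_size]
        energy = sum(w * x * x for w, x in zip(window, frame))
        envelope.append(math.sqrt(energy / norm))
    return envelope


def substitute_vibration(signal, carrier_hz, sample_rate, window_size=1024):
    """Amplitude-modulate a sinusoid with the envelope of the input signal."""
    hop = window_size // 2
    envelope = hann_rms_envelope(signal, window_size)
    if not envelope:                 # input shorter than one analysis window
        return [0.0] * len(signal)
    output = []
    for n in range(len(signal)):
        frame = min(n // hop, len(envelope) - 1)   # hold each value for one hop
        carrier = math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        output.append(envelope[frame] * carrier)
    return output
```

For example, `substitute_vibration(filtered, 40.0, 48000)` would produce the 40 Hz condition from a low-pass-filtered input list `filtered`; replacing the sinusoid with low-pass-filtered noise would give the WGN condition.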


The quality scores are presented in Fig. 7.14. An ANOVA was applied for the
statistical analysis. All of the substitute vibrations, except for the 20 Hz condition,
were judged to be better than reproduction without vibration at a highly significant
level (p < 0.01). The average differences, compared with the no-vibration condition,
were between 29 scale units for the 40 Hz vibration and 18 scale units for WGN
and the 160 Hz vibration. There was no significant difference between the 20 Hz
vibration and the no-vibration condition. The participants indicated that the 20 Hz
vibration was too low in frequency and did not fit with the audio content. In contrast,
40 and 80 Hz appeared to fit well. No complaints about a mismatch between sound
and vibration were noted. The resulting overall quality was judged to be comparable
to the low-pass conditions in Fig. 7.7.

Notably, even the 160 Hz vibration resulted in fair-quality ratings. However, compared
with the 80 Hz condition, a trend toward worse judgments was observed
(p ≈ 0.11). A much stronger effect was expected because this vibration frequency is
relatively high, and tingling effects can occur. There was some disagreement between
participants, which can be observed in the larger confidence intervals for this condition.

Even more interesting, the reproduction of WGN resulted in fair-quality ratings. 
However, this condition was still judged to be slightly worse than the 40 and 80 Hz 
vibrations (average difference = 11, p < 0.05). The effect was strongest for the 
BACH sequence, which resulted in poor-quality judgments (very significant interac- 
tion between sequence and treatment, p < 0.01). The BACH sequence contained long 
tones that lasted for several seconds, which did not fit with the ‘rattling’ vibrations 
excited by the noise. In contrast, in the BMG, DVORAK, and VERDI sequences, 
impulses and short tones resulted in brief vibration bursts of white noise, which felt 
less like ‘rattling.’ Nevertheless, the character of the bursts was different from sinu- 
soidal excitation. Specifically, in the BMG sequence the amplitude of the transient 
vibrations generated by the bass drum varied depending on the random section of 
the noise. This finding is most likely one of the reasons why the quality judgment for 
BMG in the noise condition tended to be worse compared, e.g., with the approach 
using a 40 Hz vibration. 

Given these observations, it appears that even simple vibration signals can result in 
good reproduction quality. For the tested sequences, amplitude-modulated sinusoids 
at 40 and 80 Hz worked well. 


7.3.5 Compression of Dynamic Range 


In the previous experiments, the overall vibration intensity was adjusted individu- 
ally by each test participant. However, the intensity differences between consecutive 
vibration components or between vibration components at different frequencies were 
kept constant. In the pilot experiments [25], it was reported that expected vibrations 
were sometimes missing. This might be because of the differing frequency-dependent
thresholds and growth of sensation for the auditory and tactile modalities [30]. Therefore,
an attempt was undertaken to better adapt the signals to the different dynamic
ranges.

To better match the growth of auditory and tactile sensation with increasing
sound and vibration intensity across modalities, the music signal was compressed in the
vibration-generation process, as illustrated in Fig. 7.15. As one moves toward lower
frequencies, the auditory dynamic range decreases gradually and the growth of sensation
with increasing intensity rises more quickly [44]. In the tactile modality, the
dynamic range is generally smaller than for audition; however, no strong dependence
on frequency between 10 and 200 Hz was found [30]. Accordingly, there was
not much variation between frequencies in the growth of sensation of seat vibrations
with increasing intensity. Therefore, less compression seems necessary toward
lower frequencies. However, a frequency-independent compression algorithm was
implemented for simplicity.

Fig. 7.15 Signal processing chain to generate vibration signals from the audio sum. The
low-pass-filtered signal was compressed using different compression factors

Fig. 7.16 Mean overall quality evaluation for no-vibration and different dynamics compression
vibration-generation approaches, plotted with 95% confidence intervals


The amount of compression needed for ideal intensity matching between both
modalities was predicted using crossmodal matching data [26]. For moderate sinusoidal
signals at 50, 100, and 200 Hz, a 12 dB increase in sound pressure level matched
well with an approximately 6 dB increase in acceleration level, which corresponds
to a compression ratio of two. Further, the curve of sensation growth versus sensation
level flattens toward higher sensation levels in the auditory [16] and tactile
domains [35]. This finding might be important because loud music typically excites
weak vibrations. The effect can be accounted for by using higher compression ratios.
Therefore, three compression ratios (two, four, and eight) were selected for testing.
Attack and release periods of 5 ms were chosen to follow the source signals quickly.

Statistical analysis was applied as described above using a repeated-measures 
ANOVA and post hoc pairwise comparisons with Bonferroni correction. The quality 
scores for the concert experience using the three compression ratios are plotted in 
Fig. 7.16. Again, the no-vibration condition was used as a reference. Compressing 
the audio signal by a ratio of 2 resulted in significantly improved quality perception 
as compared to the no-vibration condition (average difference = 26, p < 0.01). 
Although the ratings were not statistically better than the 100 Hz low-pass condition 
in Sect. 7.3.1, some test participants reported that the initial-level adjustment was 
easier, particularly for the DVORAK sequence. This finding is plausible because the 
DVORAK sequence covers quite a large dynamic range at low frequencies, which 
might have resulted in missing vibration components if the average amplitude was 
adjusted too low or in mechanical stimulation that was too strong if the average 
amplitude was adjusted too high. Therefore, compressing the dynamic range could 
have made it easier to select an appropriate vibration level. 

Increasing the compression ratio further to 4 or 8 reduced the averaged quality 
scores (average difference between 2 and 4 ratios = 11, p < 0.05; average differ- 
ence between 2 and 8 ratios = 18, p < 0.01). The reason for this decrease in quality 
appeared to be the noise floor of the audio signal, which was also amplified by the 
compression algorithm. This vibration noise was primarily noticeable and disturbing 
during the passages of music with little or no low-frequency content. In particular, 
such passages are found in BACH and VERDI. This fact would explain the bad ratings
for these sequences even at a compression ratio of 4. To check this
hypothesis, the compression ratio was set to 8, this time using a threshold, and tested 
again. Loud sounds above the threshold were compressed, whereas quieter sounds 
remained unaffected. The threshold was adjusted for each sequence so that no vibra- 
tions were perceivable during passages with little frequency content below 100 Hz. 
The resulting perceptual scores are plotted on the right side in Fig. 7.16. The qual- 
ity was judged to be significantly better compared with the no-vibration condition 
(average difference = 34, p < 0.01) and with compression ratios of 4 and 8 without 
a threshold (average difference = 18 and 26, respectively, p < 0.01). However, there 
was no significant difference compared with a compression ratio of 2. These findings 
indicate that even strong compression might be applied to music-induced vibrations 
without impairing the perceived quality of a concert experience. On the contrary,
compression appears to reduce the impression of missing vibrations and thus makes it
easier to adjust the vibration level. However, a suitable threshold must be selected
for strong compression ratios. Setting such a threshold appears possible if the source
signal has a wide dynamic range, which is typically the case for classical recordings.
In contrast, modern music or movie soundtracks are occasionally already highly
compressed with unknown compression parameters, which could be problematic.
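
The compression stage, with and without a threshold, can be illustrated with a simple feed-forward compressor. This is a sketch under stated assumptions, not the algorithm used in the study: levels are expressed in dB relative to full scale, the level detector is a one-pole smoother with the 5 ms attack and release times given above, and without a threshold every level is simply divided by the ratio (so that a 12 dB input change becomes a 6 dB output change at a ratio of two).

```python
import math


def compressed_level_db(level_db, ratio, threshold_db=None):
    """Static compression curve in dB re full scale."""
    if threshold_db is None:
        return level_db / ratio            # compress the whole dynamic range
    if level_db <= threshold_db:
        return level_db                    # quiet passages stay unaffected
    return threshold_db + (level_db - threshold_db) / ratio


def compress(signal, sample_rate, ratio, threshold_db=None,
             attack_s=0.005, release_s=0.005, floor_db=-120.0):
    """Apply the static curve to a smoothed level estimate of the signal."""
    attack = math.exp(-1.0 / (attack_s * sample_rate))
    release = math.exp(-1.0 / (release_s * sample_rate))
    level = 0.0
    output = []
    for x in signal:
        magnitude = abs(x)
        coeff = attack if magnitude > level else release
        level = coeff * level + (1.0 - coeff) * magnitude
        level_db = max(20.0 * math.log10(level), floor_db) if level > 0.0 else floor_db
        gain_db = compressed_level_db(level_db, ratio, threshold_db) - level_db
        output.append(x * 10.0 ** (gain_db / 20.0))
    return output
```

In this formulation, compression without a threshold raises low-level content: a noise floor at -80 dB is lifted to -40 dB at a ratio of two, which mirrors the amplified noise floor described above. With a threshold, only levels above it are reduced, and quieter passages pass through unchanged.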


7.3.6 Summary 


Various audio-induced vibration-generation approaches have been developed based 
on fundamental knowledge about auditory and tactile perception. The perceived qual- 
ity of concert reproduction using combined loudspeaker sound and seat vibrations 
was evaluated. It can be summarized that seat vibrations can have a considerably 
positive effect on the experience of music. Since the test participants evaluated all 
approaches in completely randomized order, the resulting mean overall quality values
can be directly compared. The quality scores for concert experiences using some
of the vibration-generation approaches are summarized in Fig. 7.17 (all judged very
significantly better than without vibrations, p < 0.01).

Fig. 7.17 Mean overall quality evaluation for music reproduction using selected
vibration-generation approaches. For better illustration, individual data points have been
connected with lines

The low-pass filter approach is most similar to vibrations potentially perceived in
real concert halls and resulted in good-quality ratings. The approach is not compu- 
tationally intensive and can be recommended for reproduction systems with limited 
processing power. Because the differences between a low-pass filter of 100 Hz and 
200 Hz were small, the lower cutoff frequency is recommended to minimize sound 
generation from the vibration system. With additional processing, the unwanted 
sound can be further reduced while preserving good-quality scores. To this end, 
one successful approach involves compression in the frequency range, e.g., using 
octave shifting. Surprisingly, even strong frequency limitation to a simple amplitude- 
modulated sinusoidal signal seems to be applicable. This allows for much simpler 
and cheaper vibration reproduction systems, e.g., in home cinema scenarios. How- 
ever, some signal processing power is necessary, e.g., to extract the envelope of the 
original signal. Furthermore, it seems useful to apply some dynamic compression, 
which makes it easier to adjust the vibration level. In this study, source signals with 
a high dynamic range have been used as a starting point. Further evaluation using 
audio data whose dynamics are already compressed with unknown parameters is 
necessary. 

Participants usually chose higher acceleration levels in the laboratory compared 
to measurements in real concert situations. It can be hypothesized that the absolute 
acceleration level influences the perceived quality of a concert experience. This 
question should be examined in a further study. 

In summary, test participants seemed to be relatively tolerant to a wide range 
of music and seat vibration combinations. Perhaps our real-life experience with the 
simultaneous perception of auditory and tactile events is varied and expectations 
are therefore not strictly determined. For example, the intensity of audio-related 
vibrations might vary heavily between different concert venues. Additionally, various 
aspects of tactile perception are less refined than for audition. In particular, frequency 
resolution and pitch perception are strongly restricted [42] for touch, which allows 
the modification of frequency content within a wide range. 

The effect of additional vibration reproduction depended to some extent on the 
selected music sequence. For example, in most of the conditions with vibration, the BMG
rock music sequence was judged significantly better than the classical compositions
(see Fig. 7.17). This seems plausible because we expect strong audio-induced
vibrations at rock concerts. However, adding vibrations seems to clearly increase the 
perceived concert quality, even for classical pieces of music. 


7.4 Conclusions 


It has been shown in this chapter that there is a general connection between vibra- 
tions and the perceived quality of music reproduction. However, in this study only 
seat vibrations have been addressed, and a 5.1 surround sound setup was used.
Interestingly, none of the participants complained about an implausible concert experience.
Still, one could question whether the 5.1 reproduction situation can be compared
with a live situation in a concert hall or church. Because test participants
preferred generally higher acceleration levels, it is hypothesized that real halls could 
benefit from amplifying the vibrations in the auditorium. This could be achieved 
passively, e.g., by manipulating floor construction, or actively using electrodynamic 
exciters as in the described experiments. Indeed, in future experiments it would be 
interesting to investigate the effect of additional vibration in a real concert situa- 
tion. Also, the vibration system could be hidden from participants in order to avoid 
possible biasing effects. 

During the experiments, the test participants sometimes indicated that the vibra- 
tions felt like tingling. This effect could be reduced by removing higher frequencies 
or shifting them down. However, this processing also weakened the perceived tactile 
intensity of broadband transients. The question arises, what relevance do transients 
have for the perceived quality of music compared with steady-state vibrations? One 
approach to reduce the tingling sensations for steady-state tones and simultaneously 
keep transients unaffected would be to fade continuous vibrations with a long attack 
and a short release using a compressor. This type of temporal processing appears to 
be promising based on an unpublished pilot study and should be further evaluated. 

Another approach for conveying audio-related vibration would be to code audi- 
tory pitch information into a different tactile dimension. For example, it would be 
possible to transform the pitch of a melody into the location of vibration along the 
forearm, tongue, or back using multiple vibration actuators. This frequency-to-place 
transformation approach is usually applied in the context of tactile hearing aids, in
which the tactile channel is used to replace impaired auditory perception [20, 40].
However, in such sensory substitution systems, the transformation code needs to be 
learned. It has been shown in this study that it might not be necessary to code all 
available auditory information into the tactile channel to improve the perceived qual- 
ity of music. Still, there is creative potential using this approach, which was applied 
in several projects [10, 11, 15]. 

Another interesting effect is the influence of vibrations on loudness perception at 
low frequencies, the so-called auditory-tactile loudness illusion [33]. It was demon- 
strated that tones were perceived to be louder when vibrations were reproduced 
simultaneously via a seat. This illusion can be used to reduce the bass level in a 
discotheque or an automobile entertainment system [29] and might have an effect on 
the ideal low-frequency audio equalization in a music reproduction scenario. 

Chapter 8 
The MSCI Platform: A Framework for the Design and Simulation 
of Multisensory Virtual Musical Instruments 


James Leonard, Nicolas Castagné, Claude Cadoz and Annie Luciani 


Abstract This chapter presents recent work concerning physically modelled virtual 
musical instruments and force feedback. Firstly, we discuss fundamental differences 
in the gesture–sound relationship between acoustic instruments and digital musical 
instruments, the former being linked by dynamic physical coupling, the latter by 
transmission and processing of information and control signals. We then present an 
approach that allows experiencing physical coupling with virtual instruments, using 
the CORDIS-ANIMA physical modelling formalism, synchronous computation and 
force-feedback devices. To this end, we introduce a framework for the creation and 
manipulation of multisensory virtual instruments, called the MSCI platform. In par- 
ticular, we elaborate on the cohabitation, within a single physical model, of sections 
simulated at different rates. Finally, we discuss the relevance of creating virtual 
musical instruments in this manner, and we consider their use in live performance. 


8.1 Introduction 


Computers have deeply changed our way of thinking, working, communicating and 
creating. The musical world is no exception to this transformation, whether in popular 
music—which now relies predominantly on electronic means—or in the processes 
of many modern composers who use software tools to address formal compositional 
problems, and to capture, synthesise, process and manipulate sound. The rapid 
advances in computer technology now enable real-time computing and interactive 
control of complex digital sound synthesis and processing algorithms. When cou- 
pled with interfaces that capture musical gestures and map them to the algorithms’ 
parameters, such systems are named digital musical instruments (DMIs). They are 
now widespread musical tools and allow for a true form of virtuosity. 

However, a fundamental question arises as to the relationship between a musi- 
cian and a DMI: is it of a similar nature to the relationship that is established with 
conventional instruments? This question is complex, especially given the available 
panoply of synthesis techniques and control paradigms. Moreover, digital synthesis 
brings forth an array of new possibilities for controlling musical timbres, as well as 
their arrangement at a macro-structural level. It is legitimate to ask whether 
these tools should be envisaged by analogy to acoustical instruments, i.e. whether 
they should offer means of manipulation analogous to traditional instruments, or 
whether they require entirely new control and interaction paradigms. 

This issue ultimately questions the very definition of a musical instrument: can (and 
should) a digital interface controlling a real-time sound synthesis process be called 
an instrument, in the sense that it enables an embodiment comparable to traditional 
instruments? Can DMIs and conventional instruments be grouped into the same 
category? Also, is controlling digital synthesis by imitating the way we interact with 
traditional instruments the most effective approach? 

We discuss these issues by considering that the recreation of the physical instru- 
mental relationship between musicians and DMIs is indeed relevant (see Chap. 2). 
When a digital sound synthesis process is physically based (i.e. relying on physical 
laws to create representations of sound-producing virtual objects), a bidirectional link 
between gesture and sound can be established that coherently transforms mechanical 
energy provided by the user into airborne vibrations of the virtual instrument. This 
energetic link, which is present in acoustical and electroacoustical instruments, is 
referred to by Cadoz as the ergotic function of instrumental gestures [6] and has 
been shown to be a key factor in their expressiveness [24, 33]. 

The design of DMIs addressing these issues calls for: 


• Specific physical modelling and simulation paradigms for digital sound synthesis, 
in order to design and simulate the dynamics of virtual vibrating objects and 
mechanical systems. 

• The use of adequate force-feedback technologies to enable energetic coupling 
between the user and the simulated instrument. 

• Software and hardware solutions to run such physical models synchronously and 
in real time, at rates of several kHz for the user–instrument control chain, and at 
audio rates (typically 44.1 kHz or higher) for the acoustical components. 

• Tools to physically model the instrument and the various mechanical features that 
define its ergonomics and playability. 


Our answer to these requirements is the Modeleur-Simulateur pour la Création 
Instrumentale (MSCI) platform, a complete workstation for designing and crafting 
physics-based multisensory virtual musical instruments and for playing them with 
force feedback. 

The following sections will present: (a) the specifics of multisensory virtual musi- 
cal instruments, (b) hardware and software design for the MSCI platform, (c) consid- 
erations for modelling the mechanics of musical instruments and their decomposition 
into sections simulated at different rates and (d) use of the platform as a creative tool, 
including the first use of the MSCI platform by Claude Cadoz in a live performance. 


8.2 A Physical Approach to Digital Musical Instruments 


The incorporation of haptic devices into musical applications has become a regular 
feature in the field of computer music, be it by using force-feedback systems or 
vibrotactile actuators, which are now present in widespread consumer electronics 
(common actuator technology is described in Sect. 13.2). Devices are becoming more 
affordable, and a large number of studies point towards the benefits yielded by such 
systems in terms of control and manipulation for musical tasks [2, 3, 16, 24, 27–29] 
(see also Chap. 6). 

Two main approaches for integrating haptics in digital instrumental performance 
can be distinguished: (i) augmenting DMIs with haptic feedback to enhance their 
control and convey information to the user, or (ii) making a virtual instrument tan- 
gible by enabling gestural interaction with a haptic representation of all or part of 
the instrument’s mechanical features. Concerning the latter case, at least two sub- 
categories can be described, namely: (ii-a) the distributed approach, in which the user 
interacts haptically with a model of the gestural interface of the instrument, which 
in turn controls the sound synthesis process through feed-forward mapping strategies 
(historically referred to as the multimodal approach at ACROE-ICA), and (ii-b) the 
unitary approach, in which the entire instrument is represented by a single physical 
model that is used to render audio, haptic and possibly even visual feedback (we 
refer to this single-model scenario as multisensory). 


8.2.1 Distributed Approach to Haptic Digital Musical 
Instruments 


The distributed (or multi-model) approach to haptic DMIs follows the classic decom- 
position into gestural controller and sound synthesis sections [33]. The haptic, aural 
and sometimes visual stimuli are physically decoupled from each other, due to the 
distributed architecture of the instrument (see Fig. 8.1). Haptic feedback incorpo- 
rated into the gestural controller enables coupling with certain components of the 
DMI, for instance, by programming the mechanical behaviour of the gestural control 
section using a local haptic model. Data extracted from the interaction between the 
user and this model can then be mapped to chosen sound synthesis parameters. 

Fig. 8.1 Distributed approach to haptic digital musical instruments 

Some examples of this approach are the Virtual Piano Action by Gillespie [15], 
or the DIMPLE software [30] in which the user interacts with a rigid dynamics 
model, and information concerning this interaction (positions, collisions, etc.) is 
then mapped to an arbitrary sound synthesis process, possibly a physically based 
simulation. 

Vibrotactile feedback inferred from the sound synthesis process itself can be 
provided to the user by integrating vibration actuators into the gestural controller. 
Such is the case of Nichols’ vBow friction-driven haptic interface [26] or Marshall’s 
vibrotactile feedback incorporated directly into DMI controllers [25]. 

Technical implementations of these systems generally rely on asynchronous com- 
putation loops for haptics and sound, employing low- to mid-priced haptic devices 
such as the Phantom Omni or the Novint Falcon. While these systems tend to bridge 
the gap between gestural control section and sound synthesis, the sound is still driven 
by mapping of sensor data, and the user physically interacts only with a local sub- 
section of the instrument. 


8.2.2 Unitary Approach to Virtual Musical Instruments 


An alternative approach to implementing haptic DMIs is to model the virtual instru- 
ment as a single multisensory physical object that jointly bears mechanical, acousti- 
cal and possibly visual properties, inherent to its physical nature. Physical modelling 
techniques are then the only viable approach. As a result, the gestural controller and 
sound synthesis sections are tightly interconnected: haptic interaction with one part 
of the instrument will affect it as a whole, and the player is haptically coupled with 
a complete single model (see Fig. 8.2 and Chap. 2). 

Fig. 8.2 Unitary approach to haptic digital musical instruments 

Making use of this approach, one can distinguish: 

• Works such as [4, 11, 29], which enable haptic interaction with a sound-producing 
physical model and rely on computation schemes and hardware technologies comparable 
to those described in Sect. 8.2.1. Generally, these works employ fairly cheap haptic 
devices, limited in terms of reactivity and peak force feedback. Also, the computation 
of the interaction is done in soft real time, often employing asynchronous 
protocols such as OSC [30] or MIDI [29]. While they do enable direct haptic interfacing 
with physical models, these systems do not strive for rigorous and coherent 
energy exchange between the musician and the virtual instrument. 

• Others [13, 19, 20, 31], which aim to model and reproduce the physical coupling 
between musician and traditional instrument as accurately as possible, including 
the exchange of energy between the two. To this end, high-performance haptic 
interfaces and synchronous high-speed computational loops are required. Such systems 
aim to capture the feel, playability and expressiveness of traditional instruments, 
while opening up the creative possibilities of physical modelling sound 
synthesis, and more generally of the computer as an instrument. 


MSCI fits into the latter category. The platform provides a musician-friendly phys- 
ical modelling environment in which users can design virtual musical instruments, 
and allows unified multisensory interaction by simulating those instruments on a 
dedicated workstation that supplies coherent aural, visual and haptic feedback. 


8.3 Hardware and Software Solutions for the MSCI 
Platform 


8.3.1 The TGR Haptic System 


The transducteur gestuel rétroactif (TGR) is a force-feedback device designed by the 
ACROE-ICA laboratory (Fig. 8.3). The first prototype was proposed by Florens in 
1978 [12], conceived specifically for the requirements of artistic creation, in particular 
for instrumental arts such as music. The first goal of the TGR is to render the dynamic 
qualities of mechanical interactions with simulated objects with the highest possible 
fidelity: to this end, it offers both a high mechanical bandwidth (up to 15 kHz) and 
high peak force feedback (up to 200 N per degree of freedom). 

Fig. 8.3 TGR haptic device. Left: a bowing end-effector; right: a 12-key TGR with keyboard 
end-effectors 

Several slices (1-DoF modular electromechanical systems comprising a sensor 
and an actuator) can be combined, allowing for any number of force-feedback-enabled 
degrees of freedom [14]. The device employed in the MSCI platform gathers 12 
independent modules that can be combined with various mechanical end-effectors, 
forming 1D, 2D, 3D or even 6D morphological configurations, adapted to the diverse 
nature of instrumental gestures such as striking, bowing, plucking and grasping. 


8.3.2 The CORDIS-ANIMA Formalism 


CORDIS-ANIMA [5] is a modular formalism for modelling and simulating 
mass-interaction networks, that is, physical models described by Newtonian point-based 
mechanics. It defines two main module types: 


• <MAT> modules represent material points that update their position in space 
in response to the force they receive, according to their inertial behaviour. The 
simplest of these is a point mass. 

• <LIA> modules represent interactions between two <MAT> modules. The 
interaction can be elastic, viscous, nonlinear, etc. A <LIA> connects two <MAT> 
modules and calculates the interaction force between them, depending on their 
relative positions (for elastic interactions) or velocities (for viscous interactions). 
The calculated force is then applied symmetrically to each of the connected <MAT> 
modules. 


CORDIS-ANIMA incorporates the notion of physical coupling between networks 
of elementary modules through the interdependence of two dual variables: position, 
an extensive variable that gives <MAT> modules a position in space, and force, 
an intensive variable that originates from interactions between <MAT> modules 
described by <LIA> modules. Computing the network requires a closed-loop calcu- 
lation: first, of the new positions of <MAT> modules, and second, of all the forces 
produced by the <LIA> modules according to the new positions of the <MAT> 
modules that they are connected to (Fig. 8.4). 
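
The two-phase loop described above can be sketched as follows for the 1D scalar case. This is a minimal illustration and not the CORDIS-ANIMA reference code: the class and function names are hypothetical, and the update rule (a second-order finite difference with the time step absorbed into the parameters) is an assumption of the sketch.

```python
class Mass:
    """<MAT>-like module: a point mass moving along one scalar axis."""

    def __init__(self, m, x0=0.0, v0=0.0):
        self.m = m
        self.x = x0
        self.x_prev = x0 - v0  # v0 is expressed in position units per sample
        self.force = 0.0       # force accumulated from the connected links

    def step(self):
        # Assumed update rule: x[n+1] = 2 x[n] - x[n-1] + F[n] / m
        x_next = 2.0 * self.x - self.x_prev + self.force / self.m
        self.x_prev, self.x = self.x, x_next
        self.force = 0.0


class FixedPoint:
    """<MAT>-like module that never moves (an anchor)."""

    def __init__(self, x0=0.0):
        self.x = x0
        self.x_prev = x0
        self.force = 0.0

    def step(self):
        self.force = 0.0  # forces are received but have no effect


class SpringDamper:
    """<LIA>-like module: linear visco-elastic link between two <MAT> modules."""

    def __init__(self, a, b, k, z=0.0):
        self.a, self.b, self.k, self.z = a, b, k, z

    def apply(self):
        stretch = self.a.x - self.b.x
        rate = (self.a.x - self.a.x_prev) - (self.b.x - self.b.x_prev)
        f = -self.k * stretch - self.z * rate
        self.a.force += f  # equal and opposite forces on the two ends
        self.b.force -= f


def simulate(mats, lias, n_steps, probe):
    """Run the closed loop and return the probe position at every step."""
    trace = []
    for _ in range(n_steps):
        for mat in mats:   # 1) new positions from previously computed forces
            mat.step()
        for lia in lias:   # 2) new forces from the new positions
            lia.apply()
        trace.append(probe.x)
    return trace


mass = Mass(m=1.0, x0=1.0)
anchor = FixedPoint()
trace = simulate([mass, anchor], [SpringDamper(mass, anchor, k=0.01)], 200, mass)
```

Here a single mass attached to an anchor by a spring behaves as an elementary oscillator. A haptic device would enter the same loop as one more <MAT>-like object whose position comes from a sensor rather than from `step`.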

Several CORDIS-ANIMA implementations exist for different geometrical 
spaces: 1D with scalar distances, or 1D, 2D and 3D with Euclidean distances. The 
1D scalar distance version is generally used to simulate vibroacoustic deformations 
in which all <MAT> modules move along a single scalar axis. Models built in this 
way are topological networks that may represent a first-order approach to vibratory 
deformations as found in musical instruments—a simplification that works well in 
most cases. 

For sound-producing physical models, networks must be simulated at audio-rate 
frequency (generally set at 44.1 kHz) in order to faithfully represent acoustical 
deformations that occur in the audible range (up to 20 kHz). Non-vibrating models, 
designed, for example, to produce visual motion or to represent mechanical systems, 
are often simulated in 1D, 2D or 3D geometrical spaces and at lower frequencies in 
the range 1–10 kHz, a bandwidth suited to instrumental performance. 

The TGR haptic device is represented in CORDIS-ANIMA as a <MAT> module: 
it reports positions taken from its sensors and receives forces from the connected 
<LIA> modules, which are then sent to the TGR’s actuators. 


8.3.3 The GENESIS Software Environment 


GENESIS [9] is ACROE-ICA’s modelling and simulation software for musical 
creation. It allows users to model vibrating objects, from elementary oscillators to 
complex musical scenes, and to simulate them off-line at 44.1 kHz. GENESIS implements 
the 1D version of CORDIS-ANIMA, meaning that all <MAT> physical modules 
move along a single scalar axis conventionally labelled x. 

Fig. 8.5 Representation of a physical model in the GENESIS environment 

Fig. 8.6 Simulation of a GENESIS model, showing displacement along the x-axis 

The modelling interface consists of a workbench representing the y-z plane, where 
<MAT> modules can be placed and interconnected through <LIA> modules to form 
topological networks (Figs. 8.5 and 8.6). Modules are given physical parameters that 
dictate their physical behaviour and initial conditions (initial position and speed of 
<MAT> modules). 


8.3.4 Synchronous Real-Time Computing Architecture 


The vast majority of available haptic devices communicate asynchronously with 
physical simulations [11, 30]. Generally, the haptic loop runs locally at approximately 
1 kHz, whereas other model components are computed at a lower rate and with less 
demanding latency constraints, following a distributed approach. Current 
general-purpose computer architectures are perfectly suited for these applications. However, 
when striving for energetically coherent instrumental interaction between the user 
and the simulated object, the communication between the haptic device and the 
simulation plays a key role. 

Fig. 8.7 Hardware and software architecture of the MSCI platform 

As underlined in Sect. 8.3.2, the global physical entity composed of the 
force-feedback device and virtual object can be defined as a physical, energy-conserving 
system only if the haptic position and force data streams integrate seamlessly into 
the CORDIS-ANIMA closed-loop simulation. To this end, the haptic loop must run 
synchronously at the rate of the physical simulation, with single-sample latency 
between its position output and force input. For simulations running at several kHz, 
the time step (approximately 20–100 µs) within which AD/DA conversions, 
bidirectional communication with the haptic device and a single computation loop for the 
whole physical model must occur imposes a reactive computing architecture with 
guaranteed response time, which is not attainable by general-purpose machines [10]. 
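
The order of magnitude of this time budget follows directly from the simulation rate. The short Python sketch below, whose helper name is hypothetical, computes the duration of one step for a 10 kHz loop and for an audio-rate loop.

```python
def time_step_us(rate_hz):
    """Duration of one simulation step, in microseconds."""
    return 1e6 / rate_hz


for rate in (10_000, 44_100):
    # Everything (conversion, device I/O, model computation) must fit in this slot.
    print(f"{rate} Hz -> {time_step_us(rate):.1f} us per step")
```

At 44.1 kHz, the whole closed loop therefore has to complete within roughly 22.7 µs, and within 100 µs at 10 kHz.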

Additionally, the simulation of physical models sufficiently complex for musical 
purposes is computationally demanding and therefore ill-suited to calculation on 
most current embedded systems. A previous simulation architecture at ACROE-ICA 
[19] was based on the TORO board from Innovative Integration; while it allowed run- 
ning the haptic loop synchronously at audio rate (44.1 kHz), the available processing 
power limited the system to small-scale physical models [20]. 

The hardware and software architecture of the MSCI platform (shown in Fig. 8.7) 
consequently addresses both the need for high computing power and reactive I/O. 
It is based on the RedHawk Linux real-time operating system (RTOS), where the 
physical simulation is computed in two sections: one running at audio rate (44.1 kHz) 
and the other running at control (gestural) rate (1–10 kHz). The TORO DSP board 
serves as a front-end for haptic I/O. Sound is handled by an external soundcard. 
These components are synchronised through a shared master clock (the soundcard’s 
wordclock). Visualisation data, on the other hand, is processed asynchronously so as 
to display the physical model during simulation. 

This platform can simulate virtual scenes with up to 7000 interacting audio-rate 
physical modules, a performance gain of approximately a factor of 50 compared to 
the previous embedded architecture. 

Fig. 8.8 Analysis of the musician/instrument ensemble as a dynamical system 


8.4 Multi-rate Decomposition of the Instrumental Chain 


The MSCI architecture is based on the idea of decomposing a physical model into 
a section running at audio rate and another one running at a lower gestural rate. 
In what follows we discuss the motivations for this decomposition, and how it can 
be addressed in the CORDIS-ANIMA framework while retaining physical coupling 
between the two sections of the physical model. 


8.4.1 Gesture–Sound Dynamics 


The mechanics of traditional instruments present a natural cohabitation of multiple 
dynamics. In particular, instruments can generally be separated into: 


• A section that is interfaced with the musician’s gestures, which we label the 
excitation structure. This section is mostly non-vibrating, and its frequency bandwidth 
is comparable to that of human instrumental gestures. Examples of this section are 
the piano key mechanics, the violin bow, a percussion mallet, a guitar pick, a harp 
or timpani pedal. 

• A section that produces vibroacoustic deformations, called the vibrating structure. 
This corresponds to the strings and body of the piano or violin, or the drum head for 
a percussion instrument. This is often completed with other components operating 
at acoustic rates, such as a bridge or a sound box. 


These two sections are coupled by means of nonlinear interactions (percussion, 
friction, plucking, etc.) that transform low-bandwidth gesture energy into high- 
bandwidth energy of acoustical vibrations (Fig. 8.8). 

Since these two sections of an instrument operate in different frequency ranges, it 
is natural to simulate their discrete-time representations at different sampling 
rates. While this results in computational optimisation, a major issue arises: how to 
retain coherent physical coupling between the low-rate and high-rate sections of the 
instrument and at the same time meet the constraints of synchronous simulation? 


8.4.2 Multi-rate CORDIS-ANIMA Simulations 


8.4.2.1 Multi-rate Closed-Loop Dynamic Systems 


The physical coupling between two sections of a CORDIS-ANIMA model simulated 
at different rates brings forth two main questions: (i) how to ensure transparent 
communication of the position and force variables between the two discrete-time 
systems in order to represent the physical coupling between them, and (ii) how to 
limit the bandwidth of position and force signals when transiting from one simulation 
space to the other? For instance, if no band-limiting is applied to the higher rate signals 
before passing them to the low-rate section, aliasing is produced. 

At first glance, the latter seems to be an elementary signal processing issue, solv- 
able by using up- and down-sampling and low-pass digital filtering. However, the 
physical simulation imposes strict constraints on the operators that can be used: it is a 
closed-loop system in which force and position variables are coupled within a single 
simulation step. In other words, a maximum delay of one sample is allowed between 
all the inputs and outputs, while any additional delay alters the physical consistency 
of the system and considerably affects the numerical stability of the simulation [22]. 
This prevents using many standard signal processing tools for up- and down-sampling 
and digital filtering, as the vast majority of them introduce additional delays. 


8.4.2.2 Inter-Frequency Coupling Operators 


To address the above issue, up- and down-sampling of position and force variables 
travelling between the high- and low-rate sections must rely on delay-free (zero- 
order) operators, even though they necessarily introduce a trade-off in terms of 
quality of the reconstructed signals. The operators were chosen in accordance with the 
nature of the variables and their integration into the CORDIS-ANIMA computational 
scheme, so as to preserve the integrity of the physical quantities circulating inside 
the multi-rate simulation. 

The two types of connections allowed by these operators are given in Fig. 8.9, 
where $X^{LF}$ and $F^{LF}$ represent, respectively, the low-rate position and force 
signals, whereas $X^{HF}$ and $F^{HF}$ represent the high-rate signals. Since no delay 
is introduced, the closed-loop nature of the simulation is preserved. 

Theory and experiments demonstrate that a multi-rate model implemented in this 
manner behaves identically to an equivalent low-rate model in terms of numerical 
stability, provided that the model operates only in the lower frequency range. How- 
ever, an inevitable consequence of using these operators is that high-rate signals are 
distorted. If left untreated, these distortions make the system completely unusable. 
Consequently, a solution has to be found to filter out unwanted artefacts, while once 
again avoiding any delay in the position–force closed loop. 
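
The operators themselves are not detailed in this section. As a generic illustration of what delay-free zero-order operators can look like, and of why they distort high-rate signals, the sketch below uses a sample-and-hold up-sampler and an unfiltered decimator. Both functions, and the choice of which sample to keep, are assumptions and not the MSCI implementation.

```python
def hold_upsample(low, ratio):
    """Zero-order hold: repeat each low-rate sample `ratio` times."""
    return [s for s in low for _ in range(ratio)]


def decimate(high, ratio):
    """Keep the last sample of every block of `ratio` samples, without filtering."""
    return high[ratio - 1::ratio]


# A smooth low-rate ramp becomes a staircase at the high rate: the abrupt
# steps are the high-frequency distortion mentioned above.
staircase = hold_upsample([0.0, 1.0, 2.0], 4)

# A high-rate signal alternating at its Nyquist frequency collapses to a
# constant after unfiltered decimation: this is aliasing.
aliased = decimate([1.0, -1.0] * 6, 4)
```

Neither operator needs past or future samples, which is what keeps the closed loop free of additional delay. The price is the staircase and aliasing artefacts, which have to be removed by other means.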

Fig. 8.9 Two inter-frequency coupling schemes with delay-free up- and down-sampling operators 
(HF stands for high frequency, LF for low frequency) 


8.4.2.3 Low-Pass Filtering by Means of Physical Models 


Fortunately, CORDIS-ANIMA models can act as filtering structures [18]. As a basic 
example, a simple mass-spring oscillator excited by an input force signal can be 
regarded as a second-order low-pass filter whose transfer function can be expressed 
explicitly in terms of physical parameters [17]. This property has, for instance, been 
used to build small virtual physical systems that smooth noise in the position data 
provided by the TGR’s sensors [19]. 
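
For orientation, the continuous-time idealisation of this statement is the textbook damped oscillator. The discrete-time expression used in CORDIS-ANIMA [17] differs in detail, and the symbols M, K and Z (mass, stiffness and damping) are notation assumed here.

```latex
M\,\ddot{x}(t) + Z\,\dot{x}(t) + K\,x(t) = F(t)
\quad\Longrightarrow\quad
H(s) = \frac{X(s)}{F(s)} = \frac{1}{M s^{2} + Z s + K},
\qquad
\omega_{0} = \sqrt{\frac{K}{M}} .
```

Well below the natural frequency ω0 the response is flat at 1/K, whereas well above it the magnitude falls off as 1/(Mω²), that is, by 12 dB per octave. Choosing M and K therefore sets the cut-off, and Z controls the resonance around it.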

It is thus possible to design physical low-pass filters that are as transparent as pos- 
sible within the low-rate bandwidth, and present a sharp cut-off before the low-rate 
Nyquist frequency. We have modelled such filters as propagation lines (mass–spring 
chaplets) with specific mass, stiffness and damping distribution. They are used to 
eliminate distortion generated by the up-sampling operators and serve as anti-aliasing 
filters for the circulation of high-rate signals towards low-rate sections, while pre- 
serving physical consistency. Careful tuning and scaling of the filtering structures 
ensure minimal impact on the mechanical properties of the simulated object (e.g., in 
terms of added stiffness, damping and inertia). 


8.4.2.4 Complete Multi-rate Haptic Simulation Chain 


Figure 8.10 presents the complete multi-rate haptic simulation chain as implemented 
in the MSCI platform. An instrument is decomposed into a lower-bandwidth gestural 
section, simulated at gestural rate, and a higher-bandwidth vibrating section, 
simulated at audio rate, both running synchronously. The two sections are coupled 
through multi-rate operators, a filtering 
mechanism and a nonlinear interaction that transform gestural energy into vibroa- 
coustic deformations. Physical energy is conserved throughout the system, ensuring 
computation stability. 

Fig. 8.10 Complete multi-rate haptic simulation chain of the MSCI Platform 


These solutions combined allow establishing a true energetic bridge between the 
real-world user and the simulated instrument, supporting the ergotic function of 
musical gestures, as defined by Cadoz and Wanderley [6, 7] (see also Chap. 2). 


8.5 Virtual Instruments Created with MSCI 


8.5.1 Workflow and Design Process 


Creating physical models in MSCI is similar to classic modelling with GENESIS, 
especially concerning the design of vibrating sections of the instrument. The haptic 
device is integrated directly into the CORDIS-ANIMA model as a series of TGR 
<MAT> modules, one for each allocated 1-DoF. However, designing haptic DMIs in 
this way presents a number of specific concerns: 


• Models should be designed as stable passive physical systems. Not meeting these 
requirements may result in undesirable and potentially dangerous instabilities 
of the haptic device, although this may occasionally yield interesting musical 
results. 

• The feel of the instrument perceived by the player is at least as important as 
the sound resulting from the interaction. It is therefore necessary to adapt the 
mechanical impedance of the interface between the real world and the simulation, 
by setting the dual constraints of position and force-feedback range according to 
both the model and the interaction(s). See also Chap. 2 in this regard. 

• Connecting a haptic device to a virtual instrument may forward hardware-related 
issues into the simulation domain. For instance, noise from the haptic device’s 
sensors may propagate through to the virtual instrument’s vibrating components. 


Details concerning calibration and impedance matching are described in [20], and 
various instrument designs are discussed in [21]. 

Fig. 8.11 A large-scale instrument (a suspended plate) designed in MSCI, struck at different 
locations by six TGR keys (located in the top-left corner) 


8.5.2 Specificities of MSCI Haptic Virtual Instruments 


Since the first release of the MSCI platform in 2015, over 100 virtual instruments have 
been created by the authors, students and the general public. The computing power of 
modern systems has made it possible, for the first time, to simulate and interact 
haptically with large-scale instruments composed of thousands of interacting modules. 
Figure 8.11 shows an example of such a model. Especially for large structures with 
nonlinear acoustical behaviour, such as membranes or cymbals, exploration through 
real-time manipulation greatly facilitates the iterative design and fine-tuning process. 


One notable feature of MSCI’s models is their rich and complex response to 
different categories of musical gestures [6]. Indeed, as the entire instrument is mod- 
elled physically with CORDIS-ANIMA, the user has access to each single simulated 
point of physical matter. This is not possible in more encapsulated or global physical 
modelling techniques such as digital waveguides [32] or modal synthesis [1]. Such 
point-level access allows for subtle and complex control of the virtual instrument 
using various haptic modules for different gestures. In the case of a simple string, 
the excitation gesture could be, e.g. plucking, striking or bowing, whereas 
modification gestures could be, e.g. pinning down the string onto the fretboard to 
change its length and pitch (as shown in Fig. 8.12), gently applying pressure onto 
specific points of the open string to obtain natural harmonics, applying pressure near 
the bridge to “palm mute” the string or even dynamically moving the bridge or the 
tuning peg of the string to change its acoustical properties over time. 

Demonstration sessions and feedback from users tend to strongly confirm the 
importance of tight physical interaction with the virtual instruments. Even the sim- 
plest models can yield a wide palette of sonic possibilities, often leading users to 
spend a fairly long time (up to 30 min) exploring the dynamics, playing modalities 
and haptic response of a single instrument. This fine degree of control enables an 
enactive learning process of getting to know an object (a virtual instrument in this 
case) through physical manipulation. 

Fig. 8.12 Plucked string model. Above: during plucking interaction; middle: open vibration of the 
string; below: pinning the string onto a fretboard, shortening its vibrating length 


8.5.3 Real-Time Performance in Hélios 


Hélios is an interactive musical and visual piece that was created for the AST 2015 
festival (Art–Science–Technologie, November 14–21, 2015, Grenoble, France). For the 
first time, an MSCI force-feedback station was used in a public live performance. 
The entire musical content and the visual scenes are created with GENESIS, 
associating a vast pre-calculated physical model with a real-time MSCI simulation. 
Video content is projected onto two screens: a large screen for the calculated visual 
scenes and a screen for the real-time visuals associated with the MSCI simulation. 
The sound projection is handled with a sound dome of 24 speakers, placed in a 
semi-sphere above the audience. 

Fig. 8.13 Complete physical model for Hélios (approximately 200000 modules) 

The pre-calculated virtual instrumental scene in Hélios is composed of approxi- 
mately 200000 GENESIS modules (Fig. 8.13). The off-line simulation of this vast 
instrumental scene allows: 


• to distribute the sonic output to 24 audio channels routed to 24 loudspeakers during
the concert;

• to memorise the entire 3D visual scene, including (low-rate and vibratory) motion
of all of the virtual instruments (using the GMDL format [23]). The scene may be
navigated through during playback using ordinary gestural input systems such as a
mouse. This “interpreted” playback can then be recorded, edited and incorporated
into the video projection during the concert.


The MSCI system incorporated into the installation uses a 12 DoF force-feedback 
device (Fig. 8.3). A model made of approximately 7000 physical modules is loaded 
onto the MSCI workstation. This model is a subgroup of the entire model, which 
guarantees coherency between the sound textures produced by the off-line simulation 
and those produced during the real-time interaction with the MSCI virtual instrument.
This fusion blurs the boundaries between off-line and real-time sections and offers
rich possibilities for the composition and musical structure in the temporal, spatial 
and structural dimensions of the piece. 

The described configuration illustrates one of the many possible interaction sce- 
narios between real and virtual players, real and virtual instruments, real-time and 
off-line (“supra-instrumental”) instrumental situations, as previously described by
Cadoz [8]. 


8.6 Conclusions 


We have presented and discussed recent solutions developed at ACROE-ICA for 
designing and implementing multisensory virtual musical instruments. These con- 
verged into the MSCI platform, the first modelling and simulation environment of its 
kind, enabling large-scale computation of physical models and synchronous high- 
performance haptic interaction that retains the ergotic qualities of musical gestures 
in a digital context. 

Several scientific and technological questions have been addressed by this work, 
in particular concerning the formalisation and implementation of physical models 
containing sections running at different rates. The models created so far and feedback 
from users lead us to believe that MSCI offers high potential as a musical meta- 
instrument, and that it is suitable for use in live performances, as demonstrated by 
Claude Cadoz in his two performances of Hélios.

Further developments will include incorporating mixed interaction between user 
manipulation and virtual agents inside the physical models. Most importantly, MSCI 
will be used in various creative contexts by musicians and composers and in pedagogical
contexts to teach about physics, acoustics and haptics.


Chapter 9
Force-Feedback Instruments
for the Laptop Orchestra of Louisiana


Edgar Berdahl, Andrew Pfalz, Michael Blandino 
and Stephen David Beck 


Abstract Digital musical instruments yielding force feedback were designed and 
employed in a case study with the Laptop Orchestra of Louisiana. The advantages 
of force feedback are illuminated through the creation of a series of musical compo- 
sitions. Based on these and a small number of other prior music compositions, the 
following compositional approaches are recommended: providing performers with 
precise, physically intuitive, and reconfigurable controls, using traditional controls 
alongside force-feedback controls as appropriate, and designing timbres that sound 
uncannily familiar but are nonetheless novel. Video-recorded performances illustrate 
these approaches, which are discussed by the composers. 


9.1 Introduction 


Applications of force feedback for designing musical instruments have been studied
since as early as 1978 at ACROE [14, 17, 21, 36] (Chap. 8 reports on recent
advancements). Such works provide a crucial reference for understanding the role
that haptic technology can play in music. The wider computer music community
has also demonstrated a sustained interest in incorporating force-feedback technology
into musical works and projects, as evidenced by a series of projects during recent
decades. Gillespie et al. have created some high-quality custom force-feedback devices
and used them for simulating the action of a piano key [24, 26]. Verplank and 
colleagues, and Oboe et al. have initiated separate efforts in repurposing old hard 
drives into force-feedback devices for music [43, 55]. More recently, the work by 
Verplank and colleagues has been extended via a collaboration with Bak and Gauthier 
[2]. Several human-computer interface researchers have experimented with using 
motorized faders for rendering force feedback [48], even for audio applications [1, 
23, 54]. The implementation of a force-feedback bowed string has also been studied 
in detail using various force-feedback devices [21, 37, 42, 49]. 

More recently, Kontogeorgakopoulos et al. have studied how to realize digital 
audio effects with physics-based models, for the purpose of creating force-feedback 
musical instruments [32, 33]. Also, Hayes has endowed digital musical instruments 
(DMIs) with force feedback using the NovInt Falcon [28]. Most recently, Battey 
et al. have studied how to realize generative music systems using force-feedback 
controllers [3]. 


9.1.1 Multisensory Feedback for Musical Instruments 


As described in Chap. 2, when a performer plays a traditional musical instrument, he 
or she typically receives auditory, visual, and haptic feedback from the instrument. 
By integrating information from these feedback modalities together [15, 39], the 
performer can more precisely control the effect of the mechanical excitation that he 
or she provides to the instrument (see Fig. 9.1). 

Most digital musical instruments have primarily aimed at providing auditory and 
visual feedback [40]. However, haptic force feedback is an intriguing additional 
modality that can provide performers with enhanced feedback from a DMI. It has 
advantages such as the following: 


• It can provide information separately from the auditory and visual modalities as
depicted in Fig. 9.1; for example, a performer may be busy looking at a score and
want to be able to feel the instrument to find the specific buttons or keys to press.

• Haptic information can be delivered directly to locally relevant parts of the human
body.

• Digital interactions can potentially be made more intuitive (potentially preventing
sensory overload [31]) by providing feedback resembling familiar interactions in
the real world.

• Haptic devices are highly reconfigurable, so the feel of a haptic musical instrument
can be extensively customized depending on what mode it is in.

• Based on what is reported in Chap. 5 for traditional instruments, when applied
carefully, haptic feedback can provide further benefits such as enhanced user
satisfaction, enhanced comfort/aesthetics, and/or a channel for sending private
communications [31].

Fig. 9.1 When a performer plays a traditional musical instrument, he or she receives auditory,
visual, and haptic feedback. The performer integrates information together from these
“multisensory” feedback channels [15, 39] while giving a mechanical excitation back to the
musical instrument in response

[Diagram labels: Visual, Auditory, Haptic Force Feedback, Musical Instrument, Mechanical excitation]


• The human reaction time can be shorter for haptic feedback than for any other
feedback modality [47].

• Accordingly, due to the decreased phase lag in the reaction time, feedback control
theory predicts that musicians could potentially play digital musical interfaces
more accurately at faster speeds when provided with appropriately designed haptic
feedback [22].

• A similar increase in accuracy has been observed in some prior experiments in
music technology [10, 45].
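
The phase-lag argument can be made concrete with a small calculation. A pure delay of T seconds contributes a phase lag of 360 · f · T degrees at frequency f, so halving the reaction delay halves the lag at every gesture rate. The sketch below is only illustrative: the two delay values are assumptions picked for the example and are not taken from [47] or [22].

```python
def phase_lag_degrees(frequency_hz: float, delay_s: float) -> float:
    """Phase lag, in degrees, that a pure time delay adds at one frequency."""
    return 360.0 * frequency_hz * delay_s


# Hypothetical reaction delays in seconds (assumed values, not measured data).
ASSUMED_DELAYS_S = {"slower feedback path": 0.20, "faster feedback path": 0.10}


def lag_by_path(gesture_rate_hz: float) -> dict:
    """Phase lag of each assumed feedback path at the given gesture rate."""
    return {
        name: phase_lag_degrees(gesture_rate_hz, delay)
        for name, delay in ASSUMED_DELAYS_S.items()
    }


# Print the lag of both assumed paths at a few repetition rates.
for rate_hz in (1.0, 2.0, 4.0):
    print(rate_hz, lag_by_path(rate_hz))
```

Under these assumed numbers, the lag of the slower path grows twice as fast with gesture rate as that of the faster path, which is the sense in which a shorter reaction time leaves more stability margin at faster playing speeds.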


9.1.2 Additional Force-Feedback Device Designs from the 
Haptics Community 


Outside the realm of computer music, a wide variety of (historically very
expensive) haptic devices have been created and researched. Many of these have
been used for scientific visualization and/or applications in telerobotic surgery or 
surgical training [12, 16, 29, 35, 38]. The expense of these devices will prevent their 
use from ever trickling down to large numbers of practicing musicians, but they are 
useful for research in haptics. 

For instructional purposes, several universities have made simple haptic force- 
feedback devices that are less expensive. For example, the series of “Haptic Paddles” 
are single degree-of-freedom devices based upon a cable connection to an off-the- 
shelf DC motor [44]. However, such designs tend to be problematic because of 
the unreliable supply of surplus high-performance DC motors [25]. In contrast, the 
iTouch device at the University of Michigan instead contains a voice coil motor, 
which is hand wound by students [25]. However, making a large number of devices 
is time intensive, and the part specifications are not currently available in an open-source
hardware format.


9.1.3 Open-Source Technology for the Design of Haptic
Musical Instruments 


Force-feedback technologies tend to be rather complex. Consequently, small-scale 
projects have been hampered as the technological necessities have required so much 
attention that little time remained for aesthetic concerns. Furthermore, practical 
knowledge needed for prototyping haptic musical instruments has not been widely 
available, which has made it even more challenging for composers to access the 
technology. 

In response, Berdahl et al. have created an open-source repository,¹ which contains
simple examples that provide insight into the design of haptic musical instruments. 
These examples are built upon a series of open-source tools that can be used to rapidly 
prototype new haptic musical instruments. The main projects within the repository 
are the following: 


• The FireFader is an extensible and open-source force-feedback device design based
on two motorized faders (see Fig. 9.2) [6]. Typically, the faders are feedback-controlled
by a laptop. The faders’ positions are sent to a host computer via a low-latency
USB connection, and in turn force-feedback signals are rapidly sent back
to the faders. Drivers are provided for controlling the FireFader from Max, Pure
Data, and Faust. Because the design is based on the Arduino framework, it can
easily be repurposed into other designs.

• The Haptic Signal Processing (HSP) objects from 2010 are a series of abstractions
in Max that enable rapid prototyping of physics-based sound synthesis models [7],
with an emphasis on pedagogy. Some of the most important abstractions in HSP
include FireFader~, resonator~, DWG-end~, mass~, and link~.² Notably,
physics-based models in HSP can be freely intermixed with other Max objects,
which is useful for studying how physics-based models and traditional signal-
based models can be mixed. Vibrotactile haptics can also be experimented with in
HSP simply by connecting audio signals to the FireFader~ object.

• Synth-A-Modeler [9, 11] is another tool for creating physics-based models.
Table 9.1 summarizes the Synth-A-Modeler objects referred to in the rest of the
chapter. Compared with HSP, the models created with Synth-A-Modeler are more
efficient and can be compiled into a wider variety of target architectures using
Faust [46]. However, HSP provides a gentler introduction to haptic technology.
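
The position-in, force-out exchange described for the FireFader can be illustrated with a minimal one-axis sketch in which the fader is connected to a virtual point mass through a linear spring-damper link. This is not code from HSP, Synth-A-Modeler, or the FireFader drivers: the names `Mass`, `link_force`, `haptic_step`, and `settle`, the parameter values, and the 1 kHz update rate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Mass:
    """Point mass moving along one axis (SI units)."""
    mass_kg: float
    position_m: float = 0.0
    velocity_m_s: float = 0.0


def link_force(stiffness, damping, pos_a, vel_a, pos_b, vel_b):
    """Force that a linear spring-damper link between a and b exerts on b."""
    return stiffness * (pos_a - pos_b) + damping * (vel_a - vel_b)


def haptic_step(mass, fader_pos_m, fader_vel_m_s, dt, stiffness=200.0, damping=0.5):
    """Advance the virtual mass by one step; return the force for the fader motor."""
    force_on_mass = link_force(stiffness, damping, fader_pos_m, fader_vel_m_s,
                               mass.position_m, mass.velocity_m_s)
    # Semi-implicit Euler: update the velocity first, then the position.
    mass.velocity_m_s += dt * force_on_mass / mass.mass_kg
    mass.position_m += dt * mass.velocity_m_s
    # The fader feels the equal and opposite reaction force.
    return -force_on_mass


def settle(target_pos_m=0.01, steps=5000, dt=0.001):
    """Hold the fader still at target_pos_m and let the virtual mass settle."""
    mass = Mass(mass_kg=0.04)  # assumed value
    forces = [haptic_step(mass, target_pos_m, 0.0, dt) for _ in range(steps)]
    return mass, forces
```

In an actual instrument the fader position would be read from the device on every step and the returned force written back to its motor; here `settle` simply holds the fader still so that the behaviour of the loop can be inspected offline.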


Workshops have been taught at a series of international conferences using the
repository.


¹ https://github.com/eberdahl/Open-Source-Haptics-For-Artists (last accessed on August 16,
2017).

² The functionality of Max is extended by abstractions, which are custom-defined objects that
encapsulate program code.

Fig. 9.2 FireFader is a force-feedback device with two motorized faders. It uses open-source
hardware and is based on the Arduino platform, so it can easily be reconfigured for a wide variety 
of applications 


9.1.4 Laptop Orchestra of Louisiana 


Since its inception, the so-called laptop orchestra has become known as an ensemble 
of musicians performing using laptops. Precisely what qualifies as a laptop orchestra 
is perhaps a matter of debate, but historically they seem to be configured similarly 
to the original Princeton Laptop Orchestra (PLOrk). As described by Dan Trueman 
in 2007, PLOrk then comprised fifteen performance stations, each consisting of
a laptop, a six-channel hemispherical loudspeaker, a multichannel sound interface, 
a multichannel audio power amplifier, and various additional commercial music 
controllers and custom-made music controllers [51, 52]. 

The Laptop Orchestra of Louisiana (shown in Fig. 9.3) was created in 2011 and 
originally consisted of five performance stations. Since then, it has been expanded to 
include ten performance stations and a server. Organizationally, the ensemble aims 
to follow in the footsteps of PLOrk and the Stanford Laptop Orchestra (SLOrk) by 
leveraging the integrated classroom concept, which encourages students to naturally 
and concurrently learn about music performance, music composition, programming, 
and design [56]. The Laptop Orchestra of Louisiana further serves the local commu- 
nity by performing repertoire written by both local students and faculty [50]. 

Table 9.1 Some of the virtual objects implemented by Synth-A-Modeler
[The Symbol column of the table consists of graphical icons.]

Name              Description

Waveguide objects
  waveguide       a length of string (i.e. one-dimensional waveguide)
  termination     ends a waveguide, allowing waves to reflect back
  junction        connects waveguides and up to one link-like object together

Mass-like objects
  mass            point mass moving in one axis (out of the page)
  ground          point mass that never moves, like an infinite mass
  port            represents the motion of a single axis of a force-feedback device
  resonators      point on an object that resonates at specified frequencies

Link-like objects
  touch           exerts a spring-like force to try to push one object “outside of” the other
  pluck           plectrum-type interaction with hysteresis (e.g. a touch-link that “switches sides”)
  stiffeninglink  link whose stiffness increases as it is extended

As opposed to composing for traditional ensembles, whose formation is usually
clearly defined, composing for laptop orchestra is generally a very open-ended activity.
Some authors even consider composing for laptop orchestra to be an ill-defined
problem [19]. An informative swath of repertoire now exists for laptop orchestras,
and other ideas may be drawn from the history of experimental music. Due to its 
open-ended nature, treating the process of composing for laptop orchestra as a design 
activity can be fruitful. Specifically, early prototyping and iteration activities can be 
helpful in providing insight [19]. This kind of thinking is also helpful when designing 
virtual instruments for haptic interaction. The authors are working on this endeavor 
not only by prototyping, iterating, and refining interaction designs into music com- 
positions, but also by expanding and honing the content available in the Open-Source 
Haptics for Artists repository [6, 7, 9, 11]. 

In 2013, students at Louisiana State University built a FireFader for each perfor- 
mance station. A laser-cut enclosure design was also created (see Fig. 9.2) to provide 
performers with a place to rest their hands. Then students and faculty started com- 
posing music for the Laptop Orchestra of Louisiana with FireFaders. This chapter 
reports on some ideas for composing this kind of music, as informed by the outcomes 
of these works. The following specific approaches are suggested: providing perform- 
ers with precise, physically intuitive, and reconfigurable controls, using traditional 
controls alongside force-feedback controls as appropriate, and designing timbres that 
sound uncannily familiar but are nonetheless novel.

Fig. 9.3 Laptop Orchestra of Louisiana performing in the Digital Media Center Theater at Louisiana
State University 


9.2 Enabling Precise and Physically Intuitive Control 
of Sound (“Quartet for Strings”) 


Compared with other electronic controls for musical instruments, such as buttons,
knobs, sliders, switches, and touchscreens, force-feedback devices have the ability to
provide performers with precise, physically intuitive, and programmable control. 
To achieve this, instruments need to be carefully designed so that they both feel 
good and sound good. It is helpful to carefully match the mechanical impedance of 
the instruments to the device and performers, and it is recommended to apply the 
principle of acoustic viability. 

Demonstrating these characteristics, Quartet for Strings by Stephen David Beck 
is a quartet written for four virtual vibrating strings. Each of these strings is played 
by a single performer using a FireFader as depicted in Fig. 9.4. To match the structure 
of a traditional string quartet, the instruments are similarly scaled to allow differ- 
ent performers to play different pitch ranges. This results in four different virtual 
instrument scales: first violin, second violin, viola, and cello. 


9.2.1 Instrument Design 


9.2.1.1 Acoustic Viability 


Acoustic viability is a digital design principle that recognizes the importance of 
integrating nuance and expressive control into digital instruments, using traditional 
acoustic instruments as inspiration [4, 5]. Traditional acoustic musical instruments 
have been refined over long periods, often spanning performers’ lifetimes, whole
centuries, or even longer. Consequently, traditional instruments tend to exhibit complex
mechanics for providing performers with nuanced, precise, expressive, and
perhaps even intimate control of sound [4].

Fig. 9.4 Quartet for Strings is for a quartet of FireFaders and laptops, each of which enables a
performer to play a virtual vibrating string

However, these nuanced relationships are sometimes lacking in simple
signal processing-based or even physics-based synthesizer designs. The reason for 
this is that significant effort is required during synthesizer design in order to afford 
nuance and expressive control. Therefore, for a digital instrument to be acousti- 
cally viable, it has been suggested that the synthesizer designer should implement 
cross-relationships between parameters such as amplitude, pitch, and spectral content 
[4, 5]. For example, designers can consider how changes in amplitude could affect 
the spectral centroid and vice versa [4]. 

With physics-based modeling, such cross-relationships will tend to be clearly 
evident if strong nonlinearities are present in a model. For example, if a lightly 
damped material exhibits a stiffening spring characteristic, then the pitch modulation 
effect will tend to result in these kinds of cross-relationships. This kind of effect can 
be observed in many real chordophones, membranophones, and idiophones [20]. 
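
The link between a stiffening spring and pitch modulation can be checked numerically. The sketch below simulates an undamped mass on a spring whose restoring force is assumed to follow F = -k·x - k3·x³ and estimates the oscillation frequency from zero crossings. The cubic law, the function name, and all parameter values are assumptions chosen for illustration; they are not the stiffeninglink parameterization of Synth-A-Modeler.

```python
def oscillation_frequency_hz(amplitude_m, mass_kg=0.001, k_linear=100.0,
                             k_cubic=1.0e8, sample_rate_hz=20_000, duration_s=1.0):
    """Estimate the frequency of a mass on a stiffening spring released at amplitude_m."""
    dt = 1.0 / sample_rate_hz
    x, v = amplitude_m, 0.0
    crossing_times = []
    for n in range(int(duration_s * sample_rate_hz)):
        # Restoring force of the assumed cubic stiffening spring.
        force = -k_linear * x - k_cubic * x ** 3
        # Semi-implicit Euler step.
        v += dt * force / mass_kg
        x_new = x + dt * v
        if x < 0.0 <= x_new:
            # Upward zero crossing, located by linear interpolation.
            crossing_times.append((n + x / (x - x_new)) * dt)
        x = x_new
    if len(crossing_times) < 2:
        raise ValueError("simulation too short to measure a frequency")
    return (len(crossing_times) - 1) / (crossing_times[-1] - crossing_times[0])


f_soft = oscillation_frequency_hz(1.0e-5)  # quiet excitation
f_loud = oscillation_frequency_hz(1.0e-3)  # strong excitation
print(f_soft, f_loud)
```

With these assumed values the strongly excited oscillator runs noticeably faster than the gently excited one, so amplitude and pitch are coupled without any explicit mapping between them.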

Accordingly for Quartet for Strings, it was decided to create a plucked string
instrument that exhibited tension modulation by interspersing mass objects with
stiffeninglink objects as shown in Fig. 9.5 [8, 20]. As with related force-feedback
instruments, the right-hand side FireFader knob can be used to pluck
the string (see Fig. 9.5, right). However, it was desired to also control
the pitch of the string using the FireFader. This was achieved by making the string
very loose or “slack” and then using the left-hand side FireFader knob to simultaneously
touch all of the string masses. For more information on how
the stiffeninglink objects are parameterized, the reader is referred to a prior
publication [8]. A demonstration video helps to illustrate how this instrument leverages
the principle of acoustic viability to realize physically intuitive and expressive
control.³

Fig. 9.5 String model GooeyStringPitchModBass in Synth-A-Modeler consists of forty
masses, interconnected by stiffeninglink objects and terminated by ground objects (see
Table 9.1). The fader knob on the right-hand side is used to pluck one of the masses. The fader knob
on the left-hand side is used to depress all of the masses simultaneously, which gradually increases
the pitch
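
A rough way to see why depressing a slack string of stiffening links raises its pitch is to look at the incremental stiffness of a single link around a static extension. The sketch below assumes a cubic force law F = k·e + k3·e³, whose incremental stiffness is k + 3·k3·e², and uses the fact that the frequency of small vibrations scales with the square root of stiffness. This is a simplification of one link, not the forty-mass model, and the force law and parameter values are assumptions rather than those given in [8].

```python
import math


def incremental_stiffness(extension_m, k_linear=1.0, k_cubic=1.0e6):
    """Slope dF/de of the assumed force law F = k*e + k3*e**3 at a static extension."""
    return k_linear + 3.0 * k_cubic * extension_m ** 2


def pitch_shift_semitones(extension_m, k_linear=1.0, k_cubic=1.0e6):
    """Pitch change of small vibrations relative to the unextended link."""
    ratio = math.sqrt(incremental_stiffness(extension_m, k_linear, k_cubic)
                      / incremental_stiffness(0.0, k_linear, k_cubic))
    return 12.0 * math.log2(ratio)


# Pitch shift for increasing static extensions (in metres).
for extension in (0.0, 0.0005, 0.001, 0.002):
    print(extension, round(pitch_shift_semitones(extension), 2))
```

Because the shift grows smoothly with extension, the depth to which the knob is pressed acts as a continuous pitch control under these assumptions.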


9.2.1.2 Impedance Matching 


Impedance matching is a technique in which the impedances of two interacting 
objects are arranged to be similar to each other. This allows optimal energy exchange 
between them. As explained in Sect. 2.2, in the musician-instrument interaction,
impedance matching ensures effective playability and tight coupling. 

In the model GooeyStringPitchModBass, the weight of the virtual model 
(e.g., the string) needs to be approximately matched to the combined weight of a 
hand holding a fader knob. This is achieved by setting the weight of each virtual 
mass to be 1 g. Since the string comprises 40 masses, its total weight is 40 g,
which is comparable to the combined weight of a hand holding a fader knob. 


9.2.2 Performance Techniques 


Two special performance techniques further exploit the precise and physically intu- 
itive control afforded by the designed instruments. 


9.2.2.1 Pizzicato with Exaggerated Pitch Modulation


First, a performer can fully depress the string and then quickly release it. Then the
force feedback rapidly moves the left-hand side fader knob back to a resting position.
The sound of this technique is reminiscent of a Bartók pizzicato, except that the pitch
descends considerably and rapidly during the attack. In Quartet for Strings, this can
be heard after the first introduction of the cello instrument.

³ https://cct.lsu.edu/eberdahl/V/DemoOfASlackString.mov (last accessed on August 16, 2017).

It should be noted that this technique can only be used expressively due to the 
virtual nature of the string’s implementation. The authors are not aware of any real 
strings that demonstrate such a strong stiffening characteristic, do not break easily,
and could be reliably performed without gradual detuning of the pitch that the
string tends toward upon release. 


9.2.2.2 Force-Feedback Jeté 


A second special technique emerges when a performer lightly depresses the left-hand
side knob so that it just makes contact with the virtual string. The model responds
accordingly with force feedback to push the knob in the opposite direction (against 
the performer’s finger). When the pressure the performer exerts and the response the 
model synthesizes are balanced in a particular proportion, the fader and instrument 
become locked together in a controlled oscillation. This oscillation can be precisely 
controlled through the physically intuitive connection with the performer. This tech- 
nique is used extensively near the end of the piece. On the score, this technique is 
indicated using the marking jeté, giving a nod to the violin technique with the same 
name. 
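
The contact behind this technique can be pictured as a one-sided spring, following the description of the touch link in Table 9.1: a force appears only while one object penetrates the other. The sketch below is an illustration under stated assumptions, not the Synth-A-Modeler implementation; the function name, the sign convention (pressing is the positive direction), and the stiffness value are hypothetical.

```python
def touch_force(knob_pos_m, string_pos_m, stiffness=500.0):
    """Force on the knob from a one-sided contact spring (negative pushes it back)."""
    penetration = knob_pos_m - string_pos_m
    # No force while the knob hovers above the string; spring-like once it presses in.
    return -stiffness * penetration if penetration > 0.0 else 0.0


print(touch_force(0.000, 0.001))  # knob above the string: no contact force
print(touch_force(0.002, 0.001))  # knob 1 mm into the string: pushed back
```

Since the force switches on and off as contact is made and lost, a light and steady finger pressure can plausibly sustain a repeated bouncing contact, which is consistent with the controlled oscillation described above.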


9.2.3 Compositional Structure 


Quartet for Strings is composed as a modular piece with three-line staves representing 
relative pitch elements (see Fig. 9.6). While precision of time and pitch is not critical 
to its performance, the piece was conceived as a composed work, not as an improvised
one. It balances control over gesture and density with aleatoric arrangements of the
parts. 

In the sense that the score invites performers with less extensive performance 
experience to try to perform as expressively as possible, the authors believe that 
the score is highly effective in the context of a laptop orchestra. The score provides 
expressive markings to encourage the performers to try to fully leverage the acousti- 
cally viable quality of the instruments. At the same time, the score allows for some 
imprecision of the timing and pitches, freeing the performers from limiting their 
performance through precisely attending to strict performance requirements. 

A studio video recording of Quartet for Strings is available for viewing at the 
project Web site, which demonstrates how the force feedback facilitates precise and 
physically intuitive control.⁴


⁴ https://www.youtube.com/watch?v=l-29Xetel KM (last accessed on August 16, 2017).

Fig. 9.6 Excerpt from Quartet for Strings


9.3 Traditional Controls Can Be Used Alongside 
Force-Feedback Controls (“Of Grating 
Impermanence”) 


Different kinds of controls provide different affordances. In the context of laptop 
orchestra, where a variety of controls are available (such as trackpads, computer 
keyboards, MIDI keyboards, or even drum pads and tablets [51]), traditional controls
can be used appropriately alongside force-feedback controls. For example, to help 
manage mental workload [41], buttons or keys can be used to change modes while 
force-feedback controls enable continuous manipulation of sound. 

This approach is applied in Of grating impermanence by Andrew Pfalz. For this 
composition, each of the four performers plays a virtual harp with twenty strings 
(see Fig. 9.7), which can be strummed using a FireFader knob. As with Quartet for 
Strings, the performance of subtle gestures is facilitated by the force feedback coming 
from the device. The musical gestures are intuitive, comfortable, and feel natural to 
execute on the instruments. 


9.3.1 Instrument Design 


The harp model incorporates both continuous control (via the faders) and discrete 
control (via the laptop keyboard). Due to this combination, performers can focus 
on dexterously making continuous musical gestures with the FireFader, while easily 
stepping through harp tunings using simple button presses. Specifically, the model 
shown in Fig. 9.7 is controlled as follows: 


• The first FireFader knob enables performers to strum across twenty evenly spaced
strings, each of which provides force feedback.

• The second FireFader knob does not provide force feedback; instead, it enables
rapid and precise control of the timbre of the strings. As the performer moves
this knob from one extreme to another, the timbre of the strings goes from being
dark and short, like a palm-muted guitar, to bright and resonant, like guitar strings
plucked near their terminations.

• The right and left arrow keys of the laptop keyboards enable the performer to
step forward or backward, respectively, through preprogrammed tunings for each
of their twenty virtual strings. Consequently, the performers do not need to be
continuously considering the precise tuning of the strings.

Fig. 9.7 For Of grating impermanence, the harp model PluckHarp20 includes twenty strings
that can be plucked using a single FireFader knob. Each of these strings is created by connecting a
termination to a waveguide to a junction to a touch link to a second waveguide to
a second termination (for more details, see Table 9.1)


9.3.2 Performance Techniques


9.3.2.1 Simultaneously Changing the Chord and Strumming 


With training, the performers gravitate toward a particular performance technique, 
especially in sections of the composition with numerous chord changes. In these 
sections, the performers learn to use the following procedure: (1) wait for notes 
to decay, (2) use the arrow key to advance the harp’s tuning to the next chord, 
(3) immediately strum the virtual strings using the FireFader, and (4) repeat. The 
ergonomics of this performance technique are illustrated in Fig. 9.8, which shows
how each performer’s right hand is operating a FireFader, while the left hand is 
operating the arrow keys (shown boxed in yellow in Fig. 9.8). 

Visual feedback is further employed to help the performers stay on track. The index 
of each chord is displayed on the laptop screen in a large font, so that performers can 
error check their progress in advancing through the score. 
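
The division of labour between the arrow keys and the fader can be sketched as a small piece of control logic: a bank of preprogrammed tunings stepped by key presses, and a function that reports which strings the knob has passed over during a strum. The sketch is hypothetical throughout. The names `TuningBank`, `string_positions`, and `strings_crossed`, the MIDI note encoding, the normalised fader range, and the clamping at both ends of the bank are assumptions, not details of the actual PluckHarp20 patch.

```python
from dataclasses import dataclass

NUM_STRINGS = 20


def string_positions(num_strings=NUM_STRINGS):
    """Evenly spaced string locations along a fader travel normalised to [0, 1]."""
    return [(i + 1) / (num_strings + 1) for i in range(num_strings)]


def strings_crossed(prev_pos, new_pos, positions):
    """Indices of the strings passed over by the knob, in the order they are crossed."""
    if new_pos >= prev_pos:
        return [i for i, p in enumerate(positions) if prev_pos < p <= new_pos]
    return [i for i, p in reversed(list(enumerate(positions)))
            if new_pos <= p < prev_pos]


@dataclass
class TuningBank:
    """Preprogrammed tunings; each entry lists one MIDI note per string (assumed encoding)."""
    tunings: list
    index: int = 0

    def step(self, key):
        """'right' advances to the next chord, 'left' goes back; the index is clamped."""
        if key == "right":
            self.index = min(self.index + 1, len(self.tunings) - 1)
        elif key == "left":
            self.index = max(self.index - 1, 0)
        return self.index

    def notes_for(self, crossed):
        """Notes sounded by the crossed strings under the current tuning."""
        return [self.tunings[self.index][i] for i in crossed]
```

A performer's procedure then maps onto two calls: `step("right")` when the arrow key is pressed, followed by `notes_for(strings_crossed(...))` for each fader movement of the strum. The value returned by `step` is the chord index that would be shown on screen.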


Fig. 9.8 For Of grating impermanence, the performers use their right hands to pluck a harp of
virtual strings and their left hands to press the arrow keys on the laptop keyboard (see the yellow
rectangles above). The right arrow advances to the next chord for the harp, and the left arrow goes
back to the previous chord


9.3.2.2 Accelerating Strums


Preprogramming the note changes for banks of twenty plucked strings also enables a
specialized strumming technique. Since each performer is passing the fader knob over
so many strings, it is possible for the performer to noticeably accelerate or decelerate
during a single strumming gesture. This technique aids in building tension during 
the first section of the composition. The authors would like to note that, although no 
formal tests have been conducted, they have the impression that the force feedback 
is crucial for this performance technique, as it makes it possible to not only hear but 
also feel each of the individual strings. 


9.3.2.3 Continuous Control of Timbre for Strumming 


The second knob on each FireFader enables the performers to occasionally but imme- 
diately alter the timbre of the strings as indicated in the score. Since this technique 
is used sparingly, it has a stark influence upon the overall sound, and it is a powerful
control that makes the instrument almost seem more lifelike. An additional distortion 
effect further influences the timbre of the strings, and this distortion is enabled and 
disabled by the arrow keys so as to match the printed score. 


9.3.3 Compositional Structure 


Of grating impermanence is performed from a fixed score. The composition comprises
several sections that demonstrate various performance techniques of the instrument.
The score shows the notes that are heard, but each performer need only choose
where he or she is in the score, rather than actually selecting notes as on a traditional
instrument. In this way, the job of the performer is similar to that of a member
of a bell choir: following along in the score and playing notes at the appropriate 
times. 

The beginning and ending sections of the composition are texturally dense and 
somewhat freer. The gestures and timings are indicated, but the precise rhythms are 
not notated. The interior sections are metered and fully notated. Stylistically, these 
sections range from monophony to interlocking textures to fast unison passages. 

A studio video recording is available for viewing at the project Web site, which 
illustrates how these performance techniques are enabled by combining traditional 
controls and force-feedback controls.⁵

⁵ https://www.youtube.com/watch?v=NcxO1ChLcr0 (last accessed on August 16, 2017).


9.4 Finding Timbres that Sound Uncannily Familiar
but Are Nonetheless Novel (“Guest Dimensions”)


When composing electroacoustic music, it can generally be useful to compose new
timbres, which can help give listeners new listening experiences. In contrast, if
timbres sound familiar to a listener, they can beneficially provide “something to
hold on to” for less experienced listeners [34], particularly when pitch and rhythm 
are not employed traditionally. In the present chapter, it is therefore suggested that 
finding timbres that sound uncannily familiar but are nonetheless novel can help 
bridge these two extremes [13, 18]. 

Guest Dimensions by Michael Blandino is a quartet that explores this concept, 
extending it by making analyzed timbres tangible using haptic technology. For exam- 
ple, each of the four performers uses a FireFader to pluck one of two virtual resonator 
models (see Fig. 9.9), whose original parameters are determined to match the timbre 
of prerecorded percussion sound samples. 


9.4.1 Instrument Design 


9.4.1.1 Calibrating the Timbre of Virtual Models to Sound Samples 


Two virtual resonator physical models were calibrated through modal decomposition 
of sound files of a struck granite block and of a gayageum, which is a Korean plucked 
string instrument [27, 30, 53]. This provided a large parameter set to use for starting 
the instrument design process. 


9.4.1.2 Scaling Model Parameters to Discover Novel Timbres 


Then, for each part and section of the composition, multiple model parameters were 
scaled with respect to the original estimated fundamental frequency, the original esti- 
mated decay times, reference mass values, pluck interaction stiffness, pluck interac- 
tion damping parameter, and virtual excitation location. It was discovered that even 
with the granite block, which did not have a harmonic tone, melodies could nonethe- 
less be realized by scaling the modal frequencies over the range of a few octaves. 
This same approach was used to enable melodies to be played with the gayageum 
model. 

Although performance techniques affected the timbre, the timbre could be more 
strongly adjusted via the model parameters. For example, to increase overall timbral 
interest and to increase sustain of the resonances, the decay times for the struck gran- 
ite block sound were lengthened significantly, enhancing the resonance of the model. 
Further adjustment of the virtual excitation location and scaling of the virtual dimen- 
sions allowed for additional accentuation of shimmering and certain initial transient 
qualities. Similarly, the gayageum model’s decay time was slightly extended, and its 
virtual excitation position was tuned for desired effects. 
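
The parameter scaling described above can be illustrated with a minimal sketch, assuming the
modal model is represented as a list of (frequency, decay time, amplitude) triples. The mode
values are invented placeholders rather than the calibrated granite or gayageum data, and the
helper names `scale_modes`, `semitones_to_ratio`, and `render` are hypothetical.

```python
import math

# Hypothetical modal parameters: (frequency in Hz, decay time in s, amplitude).
# Placeholder values only; these are NOT the calibrated granite-block data.
MODES = [(523.0, 0.40, 1.0), (1187.0, 0.25, 0.6), (2310.0, 0.10, 0.3)]


def scale_modes(modes, freq_ratio=1.0, decay_scale=1.0):
    """Scale every modal frequency and decay time by a common factor."""
    return [(f * freq_ratio, tau * decay_scale, a) for f, tau, a in modes]


def semitones_to_ratio(semitones):
    """Frequency ratio for an equal-tempered transposition."""
    return 2.0 ** (semitones / 12.0)


def render(modes, duration=1.0, sample_rate=44100):
    """Impulse response of the model: a sum of exponentially decaying sinusoids."""
    n = int(duration * sample_rate)
    out = []
    for i in range(n):
        t = i / sample_rate
        out.append(sum(a * math.exp(-t / tau) * math.sin(2 * math.pi * f * t)
                       for f, tau, a in modes))
    return out


# Transpose the inharmonic model up seven semitones and double its sustain.
fifth_up = scale_modes(MODES, freq_ratio=semitones_to_ratio(7), decay_scale=2.0)
samples = render(fifth_up, duration=0.1)
```

Because every mode is multiplied by the same ratio, the inharmonic relationships between the
modes are preserved while the perceived pitch moves, which is one plausible reading of how a
melody can be realized with a non-harmonic source such as the granite block.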

This exploration of uncannily familiar yet novel timbres is evident when listening
to the video recording of Guest Dimensions on the project Web site
(https://www.youtube.com/watch?v=SrlZ_RUXybe, last accessed on August 16, 2017).
The reader should keep in mind that the range of somehow familiar timbres realized
during the performance stems from the two originally calibrated models of a struck
granite block and a plucked gayageum.

Fig. 9.9 For Guest Dimensions, the general modal synthesis model incorporates a
resonators object that is plucked using a single FireFader knob (see Table 9.1)


9.4.1.3 Visual Display of the Force-Feedback Interaction 


The FireFaders are not marked to indicate where the center points of the sliders are,
which correspond to where the resonators are located in virtual space. Since Guest
Dimensions calls for specific rhythms to be played, it was necessary to create a very
simple visual display enabling the performers to see what they were doing. The display
showed the position of the fader knob and the position of the virtual resonator that the
fader knob was plucking. The authors have the impression that this display may have
made it easier for the performers to play more precisely in time. Overall, the need for
implementing visual displays for some music compositions is emphasized by the discussion
in Sect. 9.1.1: generally speaking, the implementation of additional feedback modalities
has the potential to enable more precise control.


9.4.2 Performance Techniques


Two plucking performance techniques in Guest Dimensions are particularly notable.
Both are facilitated by the programmable nature of the force feedback, which enables
the virtual model to be impedance matched differently depending on the performance
technique being employed. For example, the tremolo performance technique is enhanced
through a decreased virtual plectrum stiffness, while the legato performance technique
is enhanced through a moderately increased virtual plectrum stiffness.


9.4.2.1 Tremolo 


In the first section of the composition, the stiffness of the pluck link (see Fig. 9.9 and
Table 9.1) in the model is set to be relatively low. This haptic quality enables the
performers to pluck particularly rapidly back and forth across the virtual resonators
object, obtaining a tremolo effect. Especially rapid plucking results in a louder sound,
while slower plucking results in a quieter sound. According to the indications in the
score of Guest Dimensions, the performers use the tremolo technique to create a
range of dynamics.


9.4.2.2 Legato 


In the sections not involving tremolo, the performers are mostly plucking more
vigorously in a style that could be called legato. In those sections, the performers are
playing various interrelated note sequences. Instead of providing the performers with
manual control over changing the notes (as with Of grating impermanence), it was
decided that it would be more practical to automate the selection of all of the notes.
Accordingly, the following approach was used to trigger note updates: right before
one of the models is plucked, in other words right as the fader knob is approaching
the center point for the plectrum, the next corresponding fundamental frequency is
read out of a table and used to rapidly scale the fundamental frequency of the model.
Careful adjustment of the threshold point is needed to avoid pitch changes during
the resonance of prior attacks or changes after new attacks. Performers develop
an intuition for avoiding false threshold detection through confident plucking.
An advantage of this approach is that performers do not need to manually advance
the notes; however, a performer without adequate practice may occasionally advance
one note too many, and in this case, the performer will require a moment of tacet to
recover.
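
The note-update trigger described above might be sketched as follows. This is a minimal
illustration and not the implementation used in the piece: the class name `NoteAdvancer`,
the normalized knob positions, the threshold values, and the re-arm distance (a simple
hysteresis added here to avoid double triggering) are all assumptions.

```python
class NoteAdvancer:
    """Advance through a frequency table each time the knob nears the pluck point."""

    def __init__(self, frequencies, center=0.0, threshold=0.1, rearm=0.2):
        self.frequencies = list(frequencies)
        self.center = center        # knob position of the virtual resonator
        self.threshold = threshold  # distance at which the next note is loaded
        self.rearm = rearm          # assumed hysteresis: retreat distance before re-triggering
        self.index = -1
        self.armed = True

    def update(self, position):
        """Feed one knob position; return a new fundamental frequency or None."""
        distance = abs(position - self.center)
        if self.armed and distance < self.threshold:
            self.armed = False
            if self.index + 1 < len(self.frequencies):
                self.index += 1
                return self.frequencies[self.index]
        elif not self.armed and distance > self.rearm:
            self.armed = True
        return None


# One sweep across the center and back again loads two successive notes.
advancer = NoteAdvancer([220.0, 246.94, 261.63])
positions = [-0.5, -0.3, -0.05, 0.05, 0.3, 0.5, 0.3, 0.05, -0.3]
events = [f for f in map(advancer.update, positions) if f is not None]
```

In this sketch, a hesitant gesture that hovers around the threshold without retreating past
the re-arm distance loads only one note, which loosely mirrors the remark that confident
plucking helps avoid false threshold detection.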


9.4.3 Compositional Structure


As with Of grating impermanence, Guest Dimensions is performed from a fixed
score. Performers play in precise time according to a pre-written score, sometimes in
homorhythm. Each part for each section utilizes one of the two models, but adjustments
of the models are unique to the sections of each part. Melodic themes in counterpoint
are performed with the gayageum model and are accompanied by the decorative chimes of
the granite block model. Extended percussive sections feature the granite block model
in strict meter, save for a brief passage in which the performers are free to separately
overlap in interpretive gestures.


9.5 Conclusions 


A case study was presented demonstrating some ways that force-feedback DMIs 
could be integrated into laptop orchestra practice. The contributing composers real- 
ized a variety of compositional structures, but more commonalities were found in the 
successful instrument design approaches that were applied. Accordingly, the authors 
suggest that composers working in this field should consider the following: (1) pro- 
viding performers with precise, physically intuitive, and reconfigurable controls, (2) 
using traditional controls alongside force-feedback controls as appropriate, and (3) 
designing timbres that sound uncannily familiar but are nonetheless novel. Music 
performance techniques were enabled that more closely resembled some traditional 
music performance techniques, which are less commonly observed in laptop orches- 
tra practice. 


Chapter 10
Design of Vibrotactile Feedback
and Stimulation for Music Performance


Marcello Giordano, John Sullivan and Marcelo M. Wanderley 


Abstract Haptics, and specifically vibrotactile-augmented interfaces, have been the 
object of much research in the music technology domain: In the last few decades, 
many musical haptic interfaces have been designed and used to teach, perform, 
and compose music. The investigation of the design of meaningful ways to convey 
musical information via the sense of touch is a paramount step toward achieving truly 
transparent haptic-augmented interfaces for music performance and practice, and in 
this chapter we present our recent work in this context. We start by defining a model 
for haptic-augmented interfaces for music, and a taxonomy of vibrotactile feedback 
and stimulation, which we use to categorize a brief literature review on the topic. We 
then present the design and evaluation of a haptic language of cues in the form of 
tactile icons delivered via vibrotactile-equipped wearable garments. This language 
constitutes the base of a “wearable score” used in music performance and practice. 
We provide design guidelines for our tactile icons and user-based evaluations to 
assess their effectiveness in delivering musical information and report on the system’s 
implementation in a live musical performance. 


10.1 Introduction 


In recent years, the widespread availability of smartphones and tablet computers
has made vibrotactile technology (in the form of actuators specifically designed to
stimulate a user’s sense of touch via vibration) inexpensive and readily available. Haptic
researchers, both in academic and industrial contexts, have been designing ways of
communicating via the sense of touch by means of tactile effects used to provide
information such as navigational cues [50], textures [30], or notifications [44]. Systematic
studies have been conducted to assess the efficiency of these effects in well-defined
contexts, and new prototypes and applications are constantly being investigated.

In the music domain, the sense of touch can be used to convey relevant musical
information, such as articulation [43] and timing [51], especially in professional
performances [29]. Several haptic interfaces for music performance and practice
have been created in the last two decades, but the effectiveness of very few of
these has been thoroughly evaluated.

In this chapter, we present our work in the development and preliminary eval- 
uation of meaningful ways to provide information to performers via the sense of 
touch for music performance and practice. Our research, conducted in the context 
of a multidisciplinary project involving haptic researchers, composers, and wearable 
designers, is aimed at the development of a language of tactile icons specifically 
designed to convey musical information to professional musicians. These icons, 
delivered via specialized garments equipped with arrays of vibrotactile actuators, 
have been evaluated to determine their effectiveness and reliability. They will be 
used as the building blocks of a wearable score language, which composers will use 
to create new pieces and art installations. 

To provide a theoretical framework for this research, we present a brief overview 
of the current state of haptic feedback and stimulation in music technology. We 
expand the classical models of digital musical instruments (DMIs) [39] to include 
general-purpose tactile interfaces, i.e., devices where other sensory feedback may 
not be present and tactile feedback can be arbitrarily mapped to external sources of
information. We then present a literature review together with a taxonomy of tactile 
feedback and stimulation. This categorization is aimed at emphasizing the different 
functional roles that haptic technology can achieve in conveying musically relevant 
information. 


10.2 Haptic Feedback in Music Technology 


Haptic technology has been widely used in the development of interfaces for musical 
expression and musical interaction, and two main classes of devices can be identified 
in this context: DMIs and general-purpose haptic interfaces. 

In traditional musical instruments, the tactile and kinaesthetic feedback coming
from the resonating parts of the instrument gives the performer important information
about their interaction [1, 20, 28, 43] (see Chap. 2). In DMIs, the decoupling
of gesture acquisition from sound synthesis has the important effect of breaking the
mechanical feedback loop between performer and sound-producing structures. Haptic
feedback then becomes an arbitrary design factor [31], and the choice of actuators
and signals used to drive them (see Sect. 13.2) defines the instrument’s architecture.

Haptic devices can provide tactile cues during performance with DMIs, not only
if embedded into the instruments themselves, but also when deployed separately
by means of tactile displays and wearable devices that can be used to go beyond
the direct performer-instrument interaction. In the context of music performance,
these devices, which we refer to as general-purpose haptic interfaces, can convey
information about performers’ interactions with a live-electronics system [37] or
serve as learning tools to direct and guide users’ gestures via vibrotactile feedback
[49] (see also Chap. 11). They can also be used to convey score cues to a performer
on stage [45] by means of abstract languages of tactile icons [33]. In this context,
the distinction between feedback and stimulation becomes clear: The former is a
direct response of the instrument or the general-purpose interface to a user’s action;
the latter is not issued from a player-device interaction, but is a means of
communication with the user, mediated by the tactile actuators in the interface,
which can be used to convey any sort of information.

These displays usually provide either localized (i.e., single body site) or distributed 
vibrations (via actuators placed on multiple body sites), requiring the design of tactile 
effects more centered on temporal or spatial properties, respectively, or a combination 
of both. 


10.2.1 Models of Haptic-Enabled Interfaces 


The relationship between performer, haptic-enabled musical interface (either general- 
purpose device or DMI), and audience can be complex, and a number of abstract 
models of the interaction between these components can be found in the literature. 
In the case of DMIs several models exist, each of which emphasizes different aspects 
of the instrument’s design. Marshall [34] reviews four of these models [4, 5, 9, 54] 
and proposes a hybrid model merging characteristics across them. 

In Fig. 10.1, we present an extension of this model, which is a representation of
the interaction with either haptic-enabled DMIs or general-purpose devices. While
the former can provide the performer with both kinaesthetic and tactile feedback, the
latter are usually implemented as vibrotactile displays, for reasons that are mainly to
be found in current technology limitations.¹ As mentioned above, the haptic channel
does not need to be limited to the display of feedback issued as a direct response to
performers’ actions, but can be mapped arbitrarily to convey information from external
sources such as environmental variables or score parameters. This is represented
by the external information source in our model.


¹ We refer here to the case of general-purpose interfaces developed for musical applications. These
displays are generally conceived as portable/wearable devices to be used by musicians either
practicing or performing on stage. Kinaesthetic devices, on the other hand, are generally much larger
in scale and are hence difficult to integrate into the design of a portable, general-purpose musical
interface.


Fig. 10.1 Model of a haptic DMI and general-purpose haptic device. In both devices, a haptic 
generator is used to produce haptic feedback and stimulation, which is issued from mapping of 
sensor data or external information. The simultaneous use of both types of devices is also possible, 
and sensor data from either device could be mapped to the haptic generator of the other 


10.2.2 Haptic-Enabled Interfaces


Haptic-enabled interfaces for music performance can be categorized according to the 
way they deliver haptic feedback and stimulation to the final users. Both DMIs and 
general-purpose devices can address either the kinaesthetic or the tactile modality, 
and this can be done in an active or a passive way [5]: Passive feedback and stimu- 
lation come from the inherent physical properties of the interface and are not issued 
by the system’s haptic generator; active interfaces implement a haptic generator to 
provide the user with the designed kinaesthetic and tactile effects.

We will present some of the most important devices present in the literature 
following these two categories and provide a threefold taxonomy for the active tactile 
case. 


10.2.2.1 Passive Kinaesthetic Feedback 


Passive kinaesthetic feedback and stimulation are inherent to the physical character- 
istics of the controller, and do not require any externally synthesized signal. 

O’Modhrain and Essl developed three DMIs that implement passive kinaesthetic
feedback. The Pebble Box and the Crumble Bag [41] were used to control an event-based
granular synthesizer: the Pebble Box consists of a box filled with different-sized
pebble stones and a microphone that picks up the noise produced by the collisions
between pebbles. The kinaesthetic feedback offered by the interface comes from the
physical properties of the pebbles themselves, and the impact sounds act as triggering
events on the granular synthesizer. The Crumble Bag follows the same pattern and
aims to take advantage of natural “grabbing gestures.” A fabric bag is filled
with different materials that provide haptic feedback, and a small microphone in the
bag provides the necessary event triggers to the algorithm. The Scrubber [14] also
implemented the same approach: an eraser embedded with a force sensor and two
microphones was used to control friction sounds, synthesized by means of granular
or wavetable synthesis. The haptic feedback again was directly issued by the
manipulation of the device dragged along a surface.

Sinyor and Wanderley [47] developed the Gyrotyre, a handheld controller based on
a spinning wheel, in which the kinaesthetic feedback comes directly from the dynamic
properties of the system. The mapping and synthesis algorithm are designed to take
advantage of the haptic feedback, and the interface can be used for different musical
applications, such as sequencing or modifying effect parameters.


10.2.2.2 Active Kinaesthetic Interfaces 


Active kinaesthetic feedback is the response of the controller to the user’s actions, 
usually by means of synthesized signals supplied to motors or actuators, which
stimulate kinaesthetic receptors. This is most commonly referred to as force feedback. 

The earliest example of a force-feedback device specifically developed for musical 
applications is probably the Transducteur Gestuel Rétroactif (TGR) developed at 
ACROE, whose development is described in Sect. 8.3. This device was recently
used by Sinclair et al. [46] to investigate velocity estimation methods in the haptic 
rendering of a bowed string. 

Another classical example is the Moose, developed by O’Modhrain and 
Gillespie [42], consisting of a plastic puck that the user can manipulate in a 2D space, 
which is attached to flexible metal bars, connected to linear motors. Two encoders 
sense the movements of the puck, and the motors provide the corresponding force
feedback. The device was used in a bowing test, using a virtual string programmed 
in Synthesis ToolKit (STK) [10], where the presence of friction between the bow 
and the string was simulated using the haptic device. 

The vBow by Nichols [40] is a violin-like controller that uses a series of servo- 
motors and encoders to sense the movement of a rod, acting as the bow, connected 
to a metallic cable. In its latest incarnation, the vBow is capable of sensing movement in
4-DoF and producing haptic feedback accordingly.

More recently, Berdahl and Kontogeorgakopoulos [2] developed the FireFader,
a motorized fader using sensors and DC motors to introduce musicians to haptic
controllers. Both the software and hardware used for the project are open-source,
allowing musicians to customize the mapping of the interface to their specific needs.
Applications of the device are described in Chap. 9.


10.2.2.3 Passive Tactile Interfaces 


Passive tactile feedback is a form of primary feedback, which leverages the use of different
types of materials in a controller for musical expression. The properties of these 
materials (e.g., stiffness, flexibility) can affect the ergonomics of the instrument and 
its feel in the user’s hands. 

As an example, the Meta-Instrument [11] has the form of a partial exoskeleton 
embedded with buttons that the performer uses to trigger samples and events in the 
sound; the performer’s gestures are captured via sensors in the arms and mapped 
to various effects. The buttons embedded in the controller are covered in a layer of 
foam, providing the user with immediate passive feedback about the level of pressure 
applied. 


10.2.2.4 Active Tactile Feedback and Stimulation: A Taxonomy 
for Musical Interaction 


Active tactile feedback and stimulation are the main focus of this chapter, and for 
this reason we provide a more in-depth analysis of the related literature, as well 
as an updated taxonomy, based on Giordano and Wanderley [19], which will help 
categorize examples in this field. 

We propose a classification that identifies three categories of active tactile feedback
and stimulation, according to the function that the tactile effects have in the
interface design: tactile notification, tactile translation, and tactile languages.


Tactile Notification 


The most straightforward application of tactile stimulation is intended for notifying
the users about events taking place in the surrounding environment or about results of
their interaction with a system. The effects designed for this kind of application can
be as simple as single, supra-threshold stimuli² aimed at directing users’ attention,
but they can also be more complex, implementing temporal envelopes and/or spatial
patterns.

Michailidis and Berweck [37] and Michailidis and Bullock [38] have explored 
solutions to provide haptic feedback in live-electronics performance. The authors 
developed the Tactile Feedback Tool, a general-purpose interface using small vibrat- 
ing motors embedded in a glove. The interface gave musicians information about the 
successful triggering of effects in a live-electronics performance, using an augmented 
trumpet or a foot pedal switch. This device leverages the capacity of the tactile sense 
to attract users’ attention, while not requiring them to lose focus on other modalities, 
which would have been the case with the use of onstage visual displays. 


² Stimuli whose intensity exceeds vibrotactile thresholds and which are thus perceivable (see Sect. 4.2).

Van der Linden et al. [49] implemented a whole-body general-purpose vibrotactile
device. The authors used a motion capture system and a suit embedded with
vibrating motors distributed over the body to enhance the learning process of bowing
for novice violin players. A set of ideal bowing trajectories was computed using the
motion capture system; when practicing, the players’ postures would be compared
in real time with the predefined ideal trajectories. If the distance between any two
corresponding points in the two trajectories exceeded a threshold value, the motor
spatially closest to that point would vibrate, notifying the users to correct their
posture. The authors conducted a study in which several players used the suit during
their violin lessons. Results showed an improved coordination of the bowing arm,
and participants reported an enhancement in their body awareness produced by the
feedback.
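
The threshold-based notification logic described above can be sketched as follows. This is
only an illustration under stated assumptions: the Euclidean distance measure, the data
layout, the motor names, and the function name `deviation_alerts` are hypothetical and are
not taken from the cited system.

```python
import math


def deviation_alerts(ideal, actual, motors, threshold):
    """Return, per sample, the motor to vibrate when posture deviates too far.

    ideal, actual: lists of (x, y, z) points sampled at the same instants.
    motors: dict mapping a motor name to its (x, y, z) position on the body.
    threshold: maximum tolerated distance between corresponding points.
    """
    alerts = []
    for ideal_pt, actual_pt in zip(ideal, actual):
        if math.dist(ideal_pt, actual_pt) > threshold:
            # Pick the motor spatially closest to the deviating point.
            nearest = min(motors, key=lambda name: math.dist(motors[name], actual_pt))
            alerts.append(nearest)
        else:
            alerts.append(None)
    return alerts


# Hypothetical two-sample trajectory with two motors on the bowing arm.
ideal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
actual = [(0.0, 0.01, 0.0), (1.0, 0.3, 0.0)]
motors = {"wrist": (1.0, 0.2, 0.0), "elbow": (0.5, 0.0, 0.0)}
alerts = deviation_alerts(ideal, actual, motors, threshold=0.1)
```

Here the first sample stays within tolerance and produces no vibration, while the second
exceeds the threshold and activates the nearest motor.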

A similar solution was developed by Grosshauser and Hermann [21], who
used a vibrating actuator embedded in a violin bow to correct hand posture. Using
accelerometers and gyroscopes, the position of the bow could be compared in real 
time to a given trajectory, and the tactile feedback would automatically activate to 
notify the users about their wrong posture. 


Tactile Translation


With tactile translation, we refer to two separate classes of applications: One class 
implements sensory substitution techniques to convey to the sense of touch stimuli 
which would normally be addressed to other modalities; the other class simulates 
the haptic behavior of other structures whose vibrational properties have previously 
been characterized. 


Sensory Substitution 


The field of sensory substitution has been thoroughly investigated since the begin- 
ning of the last century. In 1930, von Békésy started investigating the physiology 
behind tactile perception by drawing a parallel between the tactile and the auditory 
channels in terms of the mechanisms governing perception in each [53].
A thorough review of sensory substitution applications can be found in Visell [52]. In 
a musical context, several interfaces have been produced with the aim of translating 
sound into perceivable vibrations delivered via vibrotactile displays. Crossmodal 
mapping techniques can be utilized to perform the translation, identifying sound 
descriptors to be mapped to properties of vibrotactile feedback. 

Karam et al. [27] developed a general-purpose interface in the form of an augmented
chair (the Emoti-Chair) embedded with an array of eight speakers arranged along the
back. The authors’ aim was to create a display for deaf people to enjoy music through
vibrations. They developed the Model Human Cochlea [26], a sensory substitution model
of the cochlear critical band filter on the back, and mapped different frequency bands
of a musical track, rescaled to fit into the frequency range of sensitivity of the skin
(see Sect. 4.2), to each of the speakers on the chair. In a related study, Egloff et al.
[12] investigated people’s ability to differentiate between musical intervals delivered
via the haptic channel, finding that, on average, the smallest perceptible difference
was a major second (i.e., two semitones). It was also noted that results vary widely
due to the sensitivity levels of different receptive fields across the human body. Thus,
care must be taken when designing vibrotactile interfaces intended to be used as a
means for sensory substitution.
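
To make the idea of band-to-actuator mapping concrete, the following sketch assigns an audio
frequency to one of eight channels and rescales it into a lower range. It is not the Model
Human Cochlea itself: the logarithmic band spacing, the audio range, the vibrotactile range,
and all function names are assumptions chosen for illustration.

```python
import math

AUDIO_RANGE = (20.0, 20000.0)  # assumed audible range in Hz
SKIN_RANGE = (40.0, 1000.0)    # assumed vibrotactile range in Hz (illustrative only)


def log_position(freq, lo, hi):
    """Position of freq between lo and hi on a logarithmic axis, clamped to [0, 1]."""
    pos = math.log(freq / lo) / math.log(hi / lo)
    return min(max(pos, 0.0), 1.0)


def channel_for(freq, n_channels=8):
    """Index of the actuator channel assigned to the band containing freq."""
    pos = log_position(freq, *AUDIO_RANGE)
    return min(int(pos * n_channels), n_channels - 1)


def rescale_to_skin(freq):
    """Map an audio frequency into the assumed vibrotactile range."""
    pos = log_position(freq, *AUDIO_RANGE)
    lo, hi = SKIN_RANGE
    return lo * (hi / lo) ** pos
```

With these assumed ranges, `channel_for(440.0)` falls in the fourth of the eight bands, and
the extremes of the audio range map onto the extremes of the vibrotactile range.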

Merchel et al. [36] developed a prototype mixer equipped with a tactile translation 
system to be used by sound recording technicians. A mixer augmented with an 
actuator would allow the user to recognize the instrument playing in the selected 
track only by means of tactile stimulation: A tactile preview mode would be enabled 
on the mixer, performing a real-time translation of the incoming audio. Preliminary 
results show that users were able to recognize different instruments only via the 
sense of touch; better performance was obtained for instruments producing very 
low-frequency vibrations (bass) or strong rhythmical patterns (drums). A similar 
touch screen-based system and related test applications are described in Chap. 12. 


Tactile Stimulation 


In tactile stimulation applications, the vibrational behavior of a vibrating structure 
is characterized and modeled so as to be able to reproduce it in another interface. 
Examples in this category include physical modeling of the vibrating behavior of a 
musical instrument, displayed by means of actuators. 

A DMI featuring tactile stimulation capability is the Viblotar by Marshall [35]. 
The instrument is composed of a long, narrow wooden box equipped with sensors and
embedded speakers. Sound is generated from a hybrid physical model of an electric 
guitar and a flute programmed in the Max/MSP environment. During performance, 
the instrument rests on the performer’s lap or on a stand. One hand manipulates a 
long linear position sensor and matching force sensitive resistor (FSR) underneath to 
“pluck” a virtual string. The location, force, and speed of the motion are mapped to 
frequency, amplitude, and timbre parameters of the physical model. The other hand 
operates two small FSRs which control pitch bend up and down. The sound output 
from the Viblotar can be redirected to external speakers, hence allowing the embedded 
speakers to function primarily for generating vibrotactile feedback instead of sound 
output. In this configuration, the sound output is split, with one signal sent directly 
to the external speakers and another routed through a signal processing module that 
can produce a variety of customized vibrotactile effects such as compensating for 
frequency response of loudspeakers, simulating the frequency response of another 
instrument or amplifying the frequency band to which the skin is most sensitive [34]. 


Tactile Languages 


Tactile languages are an attempt to create compositional languages solely addressed 
to the sense of touch, in which tactile effects are not just simple notifications, issued 
from the interaction with a system, but can be units or icons for abstract communi- 
cation mediated by the skin. 

An early example of tactile language is the “vibratese,” proposed by Geldard [16],
who aimed at creating a completely new form of tactile communication delivered by
voice coil actuators (see Sect. 13.2). Parameters for defining building blocks for the
language would be elements such as frequency, intensity, and waveform. A total of 45
unit blocks representing numbers and letters of the English alphabet were produced,
allowing expert users to read at rates up to 60 words per minute.

More recently, much research on tactile languages has been directed toward the
development of tactile icons. Brewster and Brown [6] introduced the notion of tactons,
i.e., tactile icons to be used to convey non-visual information by means of
abstract or meaningful associations, which have been used to convey information
about interaction with mobile phones [8]. Enriquez and MacLean [13] studied the
learnability of tactile icons delivered to the fingertips by means of voice coil-like
actuators. By modulating frequency, amplitude, and rhythm of the vibration, they
produced a set of 20 icons, which were tested in a user-based study organized in two
sessions, two weeks apart. Participants’ recognition rates reached 80% in the first
session after 10 min of familiarization with the system and more than 90% during
the second session.
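
As an illustration of how icons can be built by modulating frequency, amplitude, and rhythm,
the sketch below renders a tacton as sine bursts separated by silence. The `Tacton` data
structure, the parameter values, and the sample rate are hypothetical and do not reproduce
the icon sets used in the cited studies.

```python
import math
from dataclasses import dataclass


@dataclass
class Tacton:
    frequency: float  # carrier frequency in Hz
    amplitude: float  # peak amplitude, 0.0 to 1.0
    rhythm: tuple     # alternating (on, off, on, ...) durations in seconds


def render(tacton, sample_rate=8000):
    """Render a tacton as a list of samples: sine bursts separated by silence."""
    samples = []
    vibrating = True
    for duration in tacton.rhythm:
        n = round(duration * sample_rate)
        for i in range(n):
            if vibrating:
                phase = 2 * math.pi * tacton.frequency * i / sample_rate
                samples.append(tacton.amplitude * math.sin(phase))
            else:
                samples.append(0.0)
        vibrating = not vibrating
    return samples


# Two hypothetical icons that differ in rhythm and intensity.
short_pulses = Tacton(frequency=250.0, amplitude=0.8, rhythm=(0.1, 0.1, 0.1))
long_buzz = Tacton(frequency=250.0, amplitude=0.4, rhythm=(0.5,))
```

Keeping the carrier frequency fixed while varying rhythm and amplitude, as in these two
icons, is one simple way to obtain patterns that remain distinguishable on a single actuator.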

In a musical context, attempts to create compositional languages for the sense of 
touch can be found in the literature. Gunther [22] developed the Skinscape system, a 
tactile compositional language whose building blocks varied in frequency, intensity, 
envelope, spectral content of vibrations, and spatial position on the body of the 
user. The language was at the base of the Cutaneous Grooves project by Gunther and 
O’Modhrain [23], in which it was used to compose a musical piece to be accompanied 
by vibrations delivered by a custom-built set of suits embedded with various kinds 
of actuators. 

In terms of tactons, we are not aware of any study evaluating their effectiveness 
in the context of music performance and practice. This is the object of the remainder 
of this chapter, where we present the design and evaluation of tactile icons for expert 
musicians. 


10.3 Development and Evaluation of Tactile Icons 
for Music Performance 


This section focuses on the development of a language of vibrotactile cues to be 
used by musicians. We present the design process behind the tactons we developed 
and a methodology for evaluating their effectiveness when delivered via 
tactile-augmented garments. 
Our work was conducted in the context of Musicking the Body Electric, a four-year 
(2014-2018) multidisciplinary project involving researchers from the fields of haptics, 
music technology, music education, composition, and wearable electronics.³ 
The ultimate goal of the project is to develop tactile-augmented suits and a language 
of tactons [7] to be used as building blocks for a wearable score system. The 
language will allow composers to convey musical information via tactile stimulation 
in the context of a music performance in which musicians are free to walk in the 
performance space. The augmented garments will be able to sense the location of 
the musicians in the performance space and also the position of musicians relative 
to one another. This, for instance, would allow each of the suits to be aware of the 
proximity of other musicians in the room and cue them to play a given section of the 
piece by delivering the corresponding tactile icon. 

³ Principal investigators: Sandeep Bhagwati (Matralab, Concordia University, Montreal), Marcelo 
M. Wanderley (McGill University, Montreal), Isabelle Cossette (MPBL, McGill University), Joanna 
Berzowska (XS Labs, Concordia University); funded by the Social Sciences and Humanities 
Research Council of Canada. 


10.3.1 Hardware and Software 


The work we present is the result of the first tests conducted on two specialized 
garments produced for the project: an augmented belt embedded with six vibrating 
actuators and an elastic band embedded with a single actuator that could be worn 
around an arm or leg. These garments were developed taking advantage of the hardware 
and software we helped create for Ilinx, a multisensory art installation 
featuring a whole-body suit embedded with vibrating actuators [18]. 

The garments created for Ilinx feature a custom-designed Arduino-compatible 
board embedded with motor drivers and a Serial Peripheral Interface (SPI) bus. Each 
board can control up to six actuators independently and is connected to a BeagleBone 
Black (BBB)⁴ minicomputer via an Ethernet to SPI adapter. The BBB implements an 
Open Sound Control (OSC) parser which receives control commands from a Max- 
based synthesizer via a wireless network, and dispatches the message to the correct 
board and actuator via SPI. 
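
The routing step described above can be illustrated with a minimal sketch. It is not the project's firmware: the address pattern `/board/<n>/actuator/<m>`, the two-byte frame layout, and all function names are assumptions made for illustration. Only the limit of six actuators per board comes from the text.

```python
from dataclasses import dataclass

ACTUATORS_PER_BOARD = 6  # from the text: each board drives up to six actuators


@dataclass(frozen=True)
class ActuatorCommand:
    board: int       # index of the driver board (hypothetical numbering)
    actuator: int    # actuator index on that board, 0..5
    duty_cycle: int  # motor duty cycle in percent, 0..100


def parse_command(address: str, value: float) -> ActuatorCommand:
    """Turn a hypothetical OSC-style address such as '/board/1/actuator/4' into a command."""
    parts = address.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "board" or parts[2] != "actuator":
        raise ValueError(f"unrecognized address: {address!r}")
    board, actuator = int(parts[1]), int(parts[3])
    if board < 0 or not 0 <= actuator < ACTUATORS_PER_BOARD:
        raise ValueError(f"board or actuator index out of range in {address!r}")
    duty = max(0, min(100, round(value)))  # clamp to a valid duty cycle
    return ActuatorCommand(board, actuator, duty)


def to_spi_frame(cmd: ActuatorCommand) -> bytes:
    """Pack a command into a made-up two-byte frame: actuator index, then duty cycle."""
    return bytes([cmd.actuator, cmd.duty_cycle])
```

In such a design the parser would select the target board from `cmd.board` and write the frame to it over SPI; the actual wire format used in the project is not described in this chapter.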

Solarbotics VPM2⁵ actuators were used for the garments. This eccentric rotating 
mass (ERM) actuator (see Sect. 13.2) was chosen for its ready availability, low cost, 
and simple design, and had previously been characterized for both its physical and 
perceptual properties [15]. 

The wearable designers involved in the project (Joanna Berzowska and Alex 
Bachmayer, XS Labs, Concordia University) produced the first specialized gar- 
ment for us to test: a tactile-augmented belt with six equally spaced ERM actuators 
(Fig. 10.2). The choice of a belt as the first garment to be designed was guided by 
several reasons: The placement of the actuators on a circle around the user’s waist 
allowed for more flexibility in terms of tactile effects design; more practically, a belt 
provides an easier fit compared to leggings or sleeves, for instance [48, 50]. 

A second garment was also introduced, consisting of a single actuator mounted 
on an adjustable band made of stretchable fabric, which could be easily worn on 
body parts such as wrist, upper arm, or ankle. 


⁴ https://beagleboard.org/black (last accessed on December 17, 2017). 
⁵ https://solarbotics.com/product/vpm2/ (last accessed on December 17, 2017). 

Fig. 10.2 Augmented belt embedded with six vibrating actuators (garment design and 
manufacturing by J. Berzowska and A. Bachmayer, XS Labs, Concordia University) 


10.3.2 Symbolic and Musical Tactons: Design 
and Evaluation 


In the early phase of the project, our approach consisted of designing two sets of 
tactons, to be reproduced, respectively, by the belt and the band. The former would 
be used to convey symbolic tactons, i.e., abstract patterns that musicians would need 
to learn and associate with specific musical elements, for instance sections of a score 
or chords. The latter would instead deliver musical tactons, i.e., tactons which carry a 
unique musical meaning, attached to the temporal properties of the tacton itself. 


10.3.2.1 Symbolic Tactons 


We identified three different dimensions defining the tacton design space associated 
with the six-actuator belt: 


• A spatial dimension, associated with the definition of geometrical patterns on the 
  hexagon schematizing the disposition of the six actuators around the waist (see 
  Fig. 10.3); 
• A global temporal dimension. Once the geometrical pattern of the tacton has been 
  defined, the temporal order or sequence in which the actuators are activated can 
  shape the global perception of the tactile effect; 
• An individual temporal dimension, which pertains to the properties of the envelope 
  of the vibrotactile signal for each individual actuator. 
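
One way to picture these three dimensions is as a small data structure, sketched below. The class names, the actuator numbering (0 to 5 around the waist), and the rule that each step starts when the previous one ends are assumptions for illustration, not the authors' sequencer.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

NUM_ACTUATORS = 6  # actuators 0..5 around the waist; this numbering is an assumption


@dataclass(frozen=True)
class Envelope:
    """Individual temporal dimension: shape of one actuation, in milliseconds."""
    attack_ms: int
    sustain_ms: int
    release_ms: int

    @property
    def duration_ms(self) -> int:
        return self.attack_ms + self.sustain_ms + self.release_ms


@dataclass(frozen=True)
class Tacton:
    # Each step is the set of actuators fired together (spatial dimension);
    # the order of the steps is the global temporal dimension.
    steps: Tuple[FrozenSet[int], ...]
    envelope: Envelope

    def __post_init__(self) -> None:
        for step in self.steps:
            if not all(0 <= a < NUM_ACTUATORS for a in step):
                raise ValueError("actuator index out of range")

    def schedule(self) -> List[Tuple[int, int]]:
        """(onset_ms, actuator) pairs, assuming each step starts when the previous one ends."""
        events: List[Tuple[int, int]] = []
        for i, step in enumerate(self.steps):
            onset = i * self.envelope.duration_ms
            events.extend((onset, a) for a in sorted(step))
        return events
```

For example, `Tacton(tuple(frozenset({i}) for i in range(6)), Envelope(50, 150, 0))` describes a sweep around the waist in which one actuator fires every 200 ms.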


For the design of the symbolic tactons, we applied a heuristic approach: We defined 
several geometric patterns which we hypothesized would feature unique character- 
istics, making them easily distinguishable from one another; we then implemented 
these patterns, together with preliminary global and individual temporal properties, 
on a Max-based tactile sequencer we programmed to control the belt; a music ped- 
agogy doctoral researcher (Audrey-Kristel Barbeau) would then test the icons and 
provide immediate feedback to allow us to proceed to another iteration of the design 
process. 

Fig. 10.3 Final set of ten symbolic icons developed for the belt (diagram courtesy of A.-K. Barbeau). 
Each black dot represents one actuator. The hexagon shapes represent the actuators disposed around 
a user’s waist, with the top two actuators corresponding to the person’s front. Icons 1–4 feature a 
sequence of actuations which follow the direction indicated by the arrows. For icons 5–10, connected 
dots represent simultaneous activation of the corresponding actuators, with solid lines happening 
first, followed by dashed and then dotted lines. Each actuation lasts 200 ms, as per haptic envelope 
definition, and for each icon the pattern is repeated twice with a 300 ms interval between repetitions 


Fig. 10.4 Haptic envelopes of each individual actuation composing the icons: 50 ms 
attack time to 100% duty cycle, 150 ms sustain, and no release time (plot of duty 
cycle (%) against time (s)) 


This process lasted several weeks, after which we finalized a set of ten 
tactons, depicted in Fig. 10.3. Each of the tactile icons consists of two repetitions of 
the same pattern which are separated by a fixed time interval. The tactons have a 
total duration which varies from 1.5 to 2.7 s. For the individual temporal properties, 
we chose a fixed envelope for all the actuations which features 50 ms of attack, 
150 ms of sustain at maximum intensity, and no release time (see Fig. 10.4). We 
decided to keep the vibrotactile envelope parameters fixed for this initial phase of the 
project to facilitate the tactons’ learning phase. These tactile icons were proposed to 
undergraduate music students—a saxophone player (performer 1) and a guitar player 
(performer 2)—who were the participants for the ensuing evaluation sessions. 
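
The envelope and duration figures given above can be checked with a short sketch. The linear shape of the attack ramp, the assumption that steps within a pattern run back to back, and the function names are illustrative choices. The 200 ms actuation, two repetitions, and 300 ms interval come from the caption of Fig. 10.3.

```python
def duty_cycle(t_ms: float, attack_ms: float = 50.0, sustain_ms: float = 150.0) -> float:
    """Duty cycle (%) of one actuation at time t_ms; the linear ramp is an assumption."""
    if t_ms < 0 or t_ms >= attack_ms + sustain_ms:
        return 0.0  # outside the actuation (there is no release phase)
    if t_ms < attack_ms:
        return 100.0 * t_ms / attack_ms  # attack ramp from 0 to 100%
    return 100.0  # sustain at maximum intensity


def tacton_duration_ms(n_steps: int, actuation_ms: int = 200,
                       repetitions: int = 2, gap_ms: int = 300) -> int:
    """Total duration when steps run back to back and repetitions are separated by gap_ms."""
    return repetitions * n_steps * actuation_ms + (repetitions - 1) * gap_ms


print(tacton_duration_ms(3))  # 1500
print(tacton_duration_ms(6))  # 2700
```

Under these assumptions a three-step pattern lasts 1500 ms and a six-step pattern 2700 ms, which is consistent with the reported range of 1.5 to 2.7 s. The step counts themselves are an inference from Fig. 10.3 rather than something the text states.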

The symbolic tactons we designed for the belt do not carry any musical or other 
meaning per se, and need to be learned by the performers to be proficiently used to 
convey musical information. These icons can be mapped to several musical functions, 
such as chords or sections of a piece, and these mappings also need to be mastered 
by musicians to be correctly interpreted. 

Fig. 10.5 Schematization of the envelopes of the four musical tactons developed for the 
single-actuator band (each panel plots duty cycle (%) against time (s)): 
(a) The crescendo tacton is achieved by means of exponentially increasing the duty cycle 
from 20% (perceptual threshold) to 100% over 2000 ms. 
(b) The envelope for the decrescendo tacton goes from 100% to 20% duty cycle over 
2000 ms, by using a negative exponential function. 
(c) The staccato tacton is obtained by presenting three 100 ms long vibrations at 100% 
duty cycle, with a 100 ms interval between each peak. 
(d) The legato tacton features 2 periods of a scaled sine wave going from 20% to 100% 
over 1000 ms. 


10.3.2.2 Musical Tactons 


While the symbolic tactons were designed by first creating geometric and temporal 
patterns for the vibrotactile stimuli which could later be mapped arbitrarily to musical 
functions, design of musical tactons for the single-actuator band took the opposite 
approach. For these, we started by determining the set of musical information this 
actuator would deliver. From experiences we gathered in our previous work [15], 
we hypothesized that a single-actuator configuration could be used to provide tempo 
cues, as well as information about articulation and dynamics. 

Using the heuristic approach based on iterative feedback from A.-K. Barbeau, we 
designed a set of four musical tactons associated with crescendo, decrescendo, stac- 
cato, and legato, respectively, which are shown in Fig. 10.5. These tactons contained 
a musical meaning attached to the temporal properties of the tacton itself and would 
ideally require a minimal effort to be correctly interpreted. 
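
The four envelopes described in Fig. 10.5 can be approximated as functions of time, as in the sketch below. The figure gives only endpoints and durations, so the exact exponential curve, the starting phase of the legato sine, and the reading of "over 1000 ms" as the total length of both periods are assumptions.

```python
import math

MIN_DUTY, MAX_DUTY = 20.0, 100.0  # percent; 20% is given as the perceptual threshold


def crescendo(t_ms: float, total_ms: float = 2000.0) -> float:
    """Exponential rise from MIN_DUTY to MAX_DUTY (the curve shape is an assumption)."""
    x = min(max(t_ms / total_ms, 0.0), 1.0)
    return MIN_DUTY * (MAX_DUTY / MIN_DUTY) ** x


def decrescendo(t_ms: float, total_ms: float = 2000.0) -> float:
    """Mirror image of the crescendo: exponential decay from MAX_DUTY to MIN_DUTY."""
    return crescendo(total_ms - t_ms, total_ms)


def staccato(t_ms: float, pulse_ms: float = 100.0, gap_ms: float = 100.0,
             pulses: int = 3) -> float:
    """Full-intensity pulses separated by silent gaps."""
    total = pulses * pulse_ms + (pulses - 1) * gap_ms
    if not 0 <= t_ms < total:
        return 0.0
    return MAX_DUTY if (t_ms % (pulse_ms + gap_ms)) < pulse_ms else 0.0


def legato(t_ms: float, total_ms: float = 1000.0, periods: int = 2) -> float:
    """Sine scaled to [MIN_DUTY, MAX_DUTY]; starting at the minimum is an assumption."""
    if not 0 <= t_ms <= total_ms:
        return 0.0
    mid = (MAX_DUTY + MIN_DUTY) / 2
    amp = (MAX_DUTY - MIN_DUTY) / 2
    return mid - amp * math.cos(2 * math.pi * periods * t_ms / total_ms)
```

Sampling any of these functions at a fixed rate yields a sequence of duty-cycle values that a controller could send to the single actuator.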


10.3.2.3 Preliminary Evaluation 


We conducted a preliminary evaluation of both symbolic and musical tactons’ design 
with our two musicians, who performed a series of musical tasks we associated with 
each of the icons. It was important for us to evaluate the learnability and recognition 
rate of the tactons in the context of music performance in order to establish if musi- 
cians actively engaged in a musical task could reliably recognize and respond to the 
given tactile icons. 

We performed two testing sessions, two weeks apart, following a methodology 
similar to the one reported in [13]. The musicians had 20 min per session to familiarize 
themselves with the tactons. Subsequently, they were asked to perform two recog- 
nition tasks. In task 1, they experienced a series of tactons and verbally reported 
the name or number of the tacton they thought they had perceived. In task 2, the 
musicians were given a score, shown in Fig. 10.6, and asked to perform the melody 
associated with the perceived icons. The melodies were composed to be easy to 
sight-read and perform. In the first session only symbolic tactons were tested, while 
in the second session we tested both symbolic and musical tactons. Performances 
were audio-recorded and subsequently analyzed to determine recognition rates of 
the tactile icons in both sessions. 


Session 1 


Two repetitions of task 1 were performed 10 min apart. The results are depicted in 
Fig. 10.7a and show the average recognition rate of twenty randomly ordered tactons 
for each of the two repetitions. For the first trial, the two musicians correctly identified 
86 and 77% of the tactons, respectively. In the second repetition, both performers 
achieved 88%. 

For task 2, we provided the musicians with the score shown in Fig. 10.6. This time 
we asked them to play the melody corresponding to the perceived tactile icon. The 
musicians were free to play at the tempo they desired. Fifteen randomly ordered icons 
were tested, and a new icon would be delivered via the belt while the musician was 
playing the half note ending the previous melody. Task 2 was repeated three times, 
10 min apart, and the results are depicted in Fig. 10.7b. The performers reached, 
respectively, a 92 and 79% recognition rate for the first trial, 92 and 86% for the 
second trial, and 88 and 71% for the last trial. Results declined for both performers 
in the third trial; we discuss possible factors in Sect. 10.3.2.4. 


Session 2 


A second session took place two weeks after session 1, testing both symbolic and 
musical tactons. Following the previously described protocol, we performed task 1 
first, whose results are depicted in Fig. 10.8a. 

For task 2, the musicians wore the belt and the single-actuator elastic band on their 
left upper arm. A symbolic icon would be delivered via the belt, followed by a musical 
icon from the single actuator. The musicians were asked to play the corresponding 
melody following either the articulation or the dynamics indicated by the musical 
tacton. Results are shown in Fig. 10.8b. For the symbolic icons, the first performer 
reached a recognition rate of 87% in the first trial, 86% in the second, and 70 and 
78% in the third and fourth, respectively. A similar trend can be observed for the 
musical icons, with a 100% recognition rate in the first repetition, followed by 92, 
82, and 88% in the last three trials. The second musician performed less well in this 
task, reaching a 78% recognition rate for symbolic tactile icons in trial one, 71% 
for trial two, and 76 and 77% for trials three and four, respectively. For the musical 
tactons, only 25% of the tactile icons were correctly recognized in trial one, 66% in 
trial two, and 77 and 57% in trials three and four, respectively. 

Fig. 10.6 Set of 10 simple melodies, composed by A.-K. Barbeau and associated with the ten 
symbolic tactile icons. The performer would feel one of the tactons on the augmented belt and 
perform the corresponding melody 


10.3.2.4 Musician’s Feedback and Discussion 


The two testing sessions with the undergraduate musicians show several patterns: 
Performers’ recognition rate in both sessions was consistently over 80% for task 1, 
even after only 20 min of practice with the belt (consistent with findings in Enriquez 
and MacLean [13]). This suggests that for both the musical and the symbolic tactons, 
we were able to design learnable and distinguishable tactile icons. 


When looking at the data for task 2, in both sessions we can observe important 
differences between the two performers. Performer 1 consistently achieved better 
results than performer 2, who afterward reported that the task could quickly become 
overwhelming, especially in the second session. This suggests that the complexity of 
the task prevented performer 2 from simultaneously paying attention to both types of 
tactile icons while reading and playing the melodies on the instrument. Performer 2’s 
performance nonetheless improved over time, as visible in Fig. 10.8b, going from a 
25% recognition rate for the musical icons in trial one to almost 80% in trial three. 

Fig. 10.7 Recognition rates for session 1 for both task 1 and task 2. Recognition rate is consistently 
around 80% for both performers. (a) Task 1: verbally report perceived tacton. (b) Task 2: play 
melody corresponding to the perceived tacton as indicated on the score in Fig. 10.6. 

Participant 1 scored above 80% in most of the tasks across the two sessions, and 
two trends can be identified: For both sessions, performer 1’s performance in the 
musical task decreased in trial three, compared to the first two trials. This might be 
due to the presence of adaptation effects which would decrease the sensitivity to the 
tactile icons. The musician stated that the tasks were not too demanding and that the 
icon design made it easy to differentiate the tactile effects. 

Overall, the variation between the two participants could be caused by different 
levels of proficiency on their instrument and ability to sight-read, despite their similar 
self-assessed musical expertise: Participant 1 was very confident in the sight-reading 
and performance of the melodies we proposed, while for participant 2 this task proved 
to be quite demanding, as demonstrated by the frequent hesitation in performing the 
given melodies which can be heard in the audio recording of the testing sessions. The 
different postures adopted by the two musicians when playing the saxophone and 
the guitar, respectively, could also be partly responsible for the variation between 
the two participants, but this aspect would require an investigation conducted on 
a larger group of musicians. Additionally, the limited number of repetitions and 
subjects makes it difficult to draw definitive conclusions about significant trends 
over repetitions, as randomness may have had an impact on the results. 

Fig. 10.8 Recognition rates for session 2 for both task 1 and task 2. Both symbolic and musical 
tactons were tested in this session. Results show recognition rates consistently around 80% for 
participant 1, while participant 2 performed less well in task 2. (b) Task 2: play melody 
corresponding to the perceived tacton as indicated on the score in Fig. 10.6. 

The observations reported above indicate that a satisfying degree of tactile icon 
recognition can be reached for both musical and symbolic tactons during the perfor- 
mance of a musical task, provided a high degree of confidence and expertise on the 
performer’s side. While all the musical tactons were equally well recognized during 
the two testing sessions, symbolic tactile icons 5 and 6 were the most problematic 
ones in terms of recognition rates. Tacton 5 would often be confused with tacton 9 
since, as reported by performer 1, the vibration coming from the two actuators on the 
sides would sometimes go unnoticed. This could be due to lower skin sensitivity in 
the waist area, which, combined with its peculiar geometrical pattern, made tacton 
6 also difficult to recognize at times. 
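
Recognition rates and confusions of the kind discussed above can be derived from paired lists of presented and reported tactons. The sketch below shows one plausible way to do so. The function names are hypothetical, and the response lists are made up for illustration; they are not data from the study.

```python
from collections import Counter
from typing import Dict, List, Tuple


def recognition_rate(presented: List[int], reported: List[int]) -> float:
    """Percentage of trials in which the reported tacton matches the presented one."""
    if len(presented) != len(reported) or not presented:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(p == r for p, r in zip(presented, reported))
    return 100.0 * hits / len(presented)


def confusions(presented: List[int], reported: List[int]) -> Dict[Tuple[int, int], int]:
    """Count each (presented, reported) pair where the answer was wrong."""
    return dict(Counter((p, r) for p, r in zip(presented, reported) if p != r))


# Made-up responses for illustration only; these are not data from the study.
presented = [1, 5, 9, 6, 5, 3, 2, 5, 10, 4]
reported = [1, 9, 9, 6, 5, 3, 2, 9, 10, 4]
print(recognition_rate(presented, reported))  # 80.0
print(confusions(presented, reported))        # {(5, 9): 2}
```

A tally of this kind makes systematic confusions, such as tacton 5 being reported as tacton 9, easy to separate from isolated errors.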

Ultimately, our results confirm that the transparency of a tacton [32] is not an 
absolute property of the tactile icon itself, but is very much influenced by the global 
context in which tactile information is being transmitted to users and by their available 
cognitive resources [44]. 


10.3.3 Implementation into Live Performance 


Following the evaluation sessions, the wearable score system was put into practice 
with a performance of 40 Icons about Art/Music composed by Sandeep Bhagwati 
and performed by trombonist Felix Del Tredici.⁶ The piece was the first étude to be 
composed for the augmented belt [17] and consisted of ten random repetitions of 
four musical tasks, each associated with one of the four symbolic icons chosen from 
the ten described in Sect. 10.3.2.1. In rehearsals, we worked with the performer to 
identify the set of four tactons to be used for the piece, which led to the selection 
of tactons 2, 3, 4, and 6 in Fig. 10.3. During the performance, a tacton would be 
delivered to the performer via the belt. He then had to execute the associated task 
once the corresponding tactile icon was recognized. 

Following the performance, we asked the performer about his experience during 
the piece. He found the four icons easy to recognize, while admitting that it took 
a considerable effort to pay attention to the vibrations coming from the belt while 
performing the musical tasks. 


10.4 Conclusions 


In this chapter, we presented a literature review of the use of haptic technology 
in music performance. Our focus was the design and implementation of solutions 
incorporating active vibrotactile feedback and stimulation. We presented a threefold 
taxonomy of applications in this domain and provided examples for each one of the 
categories we defined: tactile notification, translation, and languages. 


⁶ http://www.felixdeltredici.com/ (last accessed on Dec. 17, 2017). 

In the second part of the chapter, we focused on tactile languages and presented 
the results achieved in Musicking the Body Electric, a multidisciplinary project in 
which we contributed by designing and evaluating the use of tactile icons to convey 
score information to expert musicians. Several researchers have evaluated the use of 
such icons, but to our knowledge no previous evaluation of this type of tactile 
communication has been performed in the context of musical interaction. For our 
purposes, it was important to evaluate our approach in the performance of authentic 
musical tasks. The evaluation we presented shows that our design paradigms for 
the tactile icons allow for recognition rates consistently around 80% after 20 min of 
familiarization with the system. The musical tasks we proposed, on the other hand, 
seem to impact these recognition rates in a way that is dependent on the users’ musical 
expertise, and the effect of learning is visible already during a single session. 

Work continues on Musicking the Body Electric in all areas. Bhagwati composed 
Fragile Disequilibria [3], a piece for solo trombone and four spectators, for which 
new suit prototypes were designed with multiple ERM motors placed along the arms 
and legs, across the back and around the waist. New materials and technologies are also 
being tested to design a more robust and flexible platform for haptic garments that can 
be adapted to a number of different performance contexts. In addition to prototypes 
developed specifically for this project, a new modular wireless tactile system has also 
been introduced, where an array of self-contained, single-actuator devices called 
Vibropixels can be placed flexibly on a garment, allowing them to be moved or 
reconfigured depending on the application [24, 25]. Finally, new compositions are 
being created for the suits to explore some of the novel possibilities afforded by 
a vibrotactile score system, most notably the expanded use of physical space and 
movement among performers. 


Chapter 11 
Haptics for the Development 
of Fundamental Rhythm Skills, Including 
Multi-limb Coordination 


Simon Holland, Anders Bouwer and Oliver Hödl 


Abstract This chapter considers the use of haptics for learning fundamental rhythm 
skills, including skills that depend on multi-limb coordination. Different sensory 
modalities have different strengths and weaknesses for the development of skills 
related to rhythm. For example, vision has low temporal resolution and performs 
poorly for tracking rhythms in real time, whereas hearing is highly accurate. How- 
ever, in the case of multi-limbed rhythms, neither hearing nor sight is particularly well 
suited to communicating exactly which limb does what and when, or how the limbs 
coordinate. By contrast, haptics can work especially well in this area, by applying 
haptic signals independently to each limb. We review relevant theories, including 
embodied interaction and biological entrainment. We present a range of applica- 
tions of the Haptic Bracelets, which are computer-controlled wireless vibrotactile 
devices, one attached to each wrist and ankle. Haptic pulses are used to guide users 
in playing rhythmic patterns that require multi-limb coordination. One immediate 
aim of the system is to support the development of practical rhythm skills and multi- 
limb coordination. A longer-term goal is to aid the development of a wider range of 
fundamental rhythm skills including recognising, identifying, memorising, retaining, 
analysing, reproducing, coordinating, modifying and creating rhythms—particularly 
multi-stream (i.e. polyphonic) rhythmic sequences. Empirical results are presented. 
We reflect on related work and discuss design issues for using haptics to support 
rhythm skills. Skills of this kind are essential not just to drummers and percussionists 
but also to keyboard players and more generally to all musicians who need a 
firm grasp of rhythm. 


11.1 Introduction 


The role of the sense of touch in musical skills and the use of haptic devices to 
support musical activities are explored throughout this book. In this chapter, we 
focus on the use of haptics for learning fundamental rhythm skills, in particular 
skills typically learned through multi-limb coordination. The motivation for using 
haptics for this purpose relates to the different strengths and weaknesses of different 
sensory modalities. Vision is poor at tracking rhythms in real time, due to its lack of 
fine temporal discrimination, while hearing is considerably more accurate. However, 
when learning to recognise and play multi-limbed rhythms, neither hearing nor sight 
is well suited to communicate which limb does what and when, or how the limbs 
coordinate to form complex patterns. This is an area in which haptics can excel, 
by applying separate haptic signals to individual limbs. With this goal in mind, we 
have developed a system called the Haptic Bracelets and explore several applications 
in this chapter. The Haptic Bracelets are wearable haptic devices designed to help 
people learn multiple simultaneous (i.e. polyphonic) rhythmic patterns. Although 
the bracelets are fundamentally simple in conception, and although they make use 
of elements common in other haptic systems, in some respects they occupy a little 
explored part of the design space. In particular, they require different aspects of human 
cognition, perception and motor skills to be taken into account when considering 
some of the opportunities and affordances they present. 

In simple terms, the bracelets are wearable haptic devices designed to be worn 
by an individual (or, for some applications, by pairs of individuals, or groups) on all 
four limbs (two wrists and two ankles). Each bracelet contains (Fig. 11.1): a 
high-resolution inertial measurement unit (IMU)¹; precise, fast-acting vibrotactiles² with 
a wide dynamic range; a processor; and a Wi-Fi module (RN-XV Wi-Fly³). Each set 
of four bracelets is coordinated by a master processor, typically on a smartphone or 
laptop. Where more than one user is involved, master processors communicate with 
one another. 

In terms of basic operation, the bracelets can sense what actions a drummer is 
making with each limb and when. This can also be directly communicated from 
one drummer to another, as explored below. The bracelets have a range of musical 
applications, which we will consider in depth in this chapter, including the following: 


• The Haptic iPod; 
• Drum teaching, with matching sets worn by teacher and learner; 
• Musician coordination and synchronisation; 
• Teaching multi-limb drum patterns by multi-limbed haptic cueing. 

Fig. 11.1 A Haptic Bracelet, displaying the internals 

¹ Inertial measurement units typically combine accelerometers, gyroscopes and magnetometers. 
² In the present chapter, the term “vibrotactile” is often used as a noun to mean “vibrotactile actuator”. 
³ A now discontinued Wi-Fi solution. 


The above applications can be valuable not just to drummers, but to any musicians 
who need a firm grasp on how rhythmic patterns interlock. Arguably, this applies to 
all musicians, but especially to those who play polyphonic instruments or who have 
complex rhythmic interactions with other players. 

Interestingly, the Haptic Bracelets have also found applications in the digital 
health domain, particularly in rehabilitation for sufferers from a range of 
movement-related neurological conditions including stroke, Parkinson’s, Huntington’s and 
brain trauma [1–4]. However, this is mostly outside the scope of this chapter. 

There is a wealth of existing research on the use of haptics for communicating 
different kinds of information, for example notifications [5], posture improvement 
[6, 7], tempo synchronisation among musicians [8, 9] and more generally for con- 
veying information about different categories of phenomena such as forces [10], 
shapes, textures, moving objects, patterns and sequence ordering, as reviewed in 
[11]. Conversely, there is rather less research on the use of haptics for communicat- 
ing precise temporal patterns, especially multiple simultaneous temporal patterns. 
Work in broadly related parts of the design space is reviewed in Sect. 11.5. 

In order to understand how people perceive and deal with rapid temporal pat- 
terns, it helps to be aware of theories of biological entrainment and neural resonance 
theory—both of which are reviewed in the next section. 


11.2 Motivation and Theoretical Background 


The motivation and theoretical background for the Haptic Bracelets is drawn from 
a variety of sources, as we explore below. The original motivation for the bracelets 
came from music education, specifically Emil Dalcroze’s Eurhythmics. Theoretical 
insights came from research in music perception by Bamberger [12], Lerdahl and 
Jackendoff [13], and others, as well as from work in ethnomusicology by Arom [14]. 
Once various prototype versions of the bracelets were built [2, 3, 11, 15, 16], research 
from cognitive science, particularly theories of human biological entrainment and 
neural resonance theory, proved invaluable in understanding key aspects of how 
humans interact with the bracelets. 


11.2.1 Dalcroze Eurhythmics 


The Swiss music educator Emil Dalcroze (1865-1950) noticed that many of his 
students seemed to read and play music notation stiffly, as an abstract activity, with 
little evidence of feeling the rhythms in their bodies [17]. By contrast, when observing 
musicians in Algeria, he noticed that musicians seemed to feel music in their whole 
bodies, engaging more deeply with complex rhythms. Dalcroze devised a wide range 
of physical musical games, culminating in the educational system known as Dalcroze 
Eurhythmics,⁴ still widely influential and in use today [17]. Amongst other things, 
this involves students listening to music while moving arms and legs independently, 
to mirror movement in different simultaneous streams in the music. 


11.2.2 Metrical Hierarchies and Polyrhythms 


Further theoretical insights come from research in music perception and musicology, 
reflecting longstanding insights by musicians. To musical novices, musical rhythm 
may seem like “one event after another”. However, as Lerdahl and Jackendoff and 
other theorists demonstrated, nearly all Western music is governed by metre. Metre 
may be viewed as a series of hierarchically coordinated and exactly synchronised 
temporal layers—each typically highly regular—with interesting exceptions [18]. 
While there are vital other aspects to rhythm, for example figures, duration, dynam- 
ics, accents and syncopation, nevertheless this means that many aspects of coordi- 
nating rhythm can be effectively offloaded from the cognitive system and onto the 
sensorimotor system by learning to assign different regular repeating patterns to each 
limb.⁵ This can be learned by starting with just two limbs and then adding additional
limbs. In some non-Western musical traditions, polyrhythmic organisation is used 
instead of hierarchical metre. In this case, the temporal layers are not organised hier- 
archically—however, each layer is still typically highly regular, and periodically all 


⁴ The band Eurythmics was named after this educational approach.

⁵ Interestingly, in some special cases, a useful educational strategy can be to shift the memorisation
load for multi-stream rhythms in the other direction, for example from limb movement onto language
processing, e.g. by using linguistic mnemonics [11].
of the layers still reach synchronisation points [14]. Consequently, the same principles
about moving load from the cognitive system onto the sensorimotor system are
relevant. 
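
To make the idea of regular layers with periodic synchronisation points concrete, the following minimal Python sketch lays each layer out on the finest common grid and reports where all layers coincide. The function names and limb labels are illustrative assumptions and are not taken from the Haptic Bracelets software.

```python
from math import lcm


def polyrhythm_grid(layers):
    """Place each layer on the finest common grid of one full cycle.

    `layers` maps a (hypothetical) limb label to the number of evenly
    spaced pulses that limb plays per cycle, e.g. 3 against 4.
    Returns one dict per grid step saying which limbs sound on that step.
    """
    steps = lcm(*layers.values())  # finest grid shared by all layers
    return [
        {limb: step % (steps // pulses) == 0 for limb, pulses in layers.items()}
        for step in range(steps)
    ]


def sync_points(grid):
    """Grid steps at which every layer sounds together."""
    return [i for i, row in enumerate(grid) if all(row.values())]


grid = polyrhythm_grid({"right_hand": 3, "left_hand": 4})
for i, row in enumerate(grid):
    # One column per limb: 'x' marks a pulse, '.' marks a silent step.
    print(i, "".join("x" if hit else "." for hit in row.values()))
```

For three against four the grid has twelve steps, and the two layers meet only on the first step of each cycle. Each limb nevertheless repeats a strictly regular pattern, which is what allows the pattern to be assigned to a limb and then left to the sensorimotor system.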


11.2.3 Cognitive Science: Entrainment and Neural 
Resonance 


In addition to domain-specific theories from music education, music psychology 
and musicology, various theories from cognitive science help to cast light on the 
Haptic Bracelets. The most widely applicable of these are the theories of embodied 
interaction [19], enactive cognition [20] and sensorimotor contingency [21]. Broadly
speaking, these theories focus not just on purely mental processes, but on the physical 
enaction of target skills and on sensorimotor interactions that engage the whole 
body and give participants multi-sensory feedback on how their actions affect their 
surroundings. However, there are two theories from cognitive science that have much 
more specific relevance to learning multiple simultaneous rhythmic patterns, namely 
the theories of biological entrainment and neural resonance, considered below. 

Entrainment is a term, originally from physics, to describe how two or more 
physically connected rhythmic processes interact with each other in such a way that 
they adjust towards and eventually “lock in” to a common periodicity or phase. 
However, the concept has been found to have rich and unexpected applications in 
perception, neuropsychology and music psychology at a variety of different levels 
[22-24]. At the interpersonal level, musicians have a strong tendency to entrain with 
each other when playing. This is more interesting than it might appear on the surface, 
because when two or more musicians play together—despite being demonstrably in 
time with each other—it may be the case that they rarely or even never play notes 
at the same time. In the case of entrained musicians, typically what is happening is 
that, instead of being entrained to the musical surface, both players are entrained to 
a beat (part of the metre or polyrhythm) that may often be implied rather than being 
explicitly sounded. 
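
The "lock in" behaviour described above can be illustrated with a toy model. The sketch below is an assumption-laden illustration and not a model proposed in this chapter: it couples two phase oscillators with a Kuramoto-style sine term (a technique chosen here purely for illustration) and shows that, with sufficient coupling, two slightly different natural tempi settle on a common frequency.

```python
import math


def simulate_pair(f1_hz, f2_hz, coupling, seconds=20.0, dt=0.001):
    """Euler-integrate two mutually coupled phase oscillators.

    `coupling` is the coupling strength in rad/s (0 means no interaction).
    Returns the instantaneous frequencies (Hz) of both oscillators at the
    end of the run.
    """
    w1, w2 = 2 * math.pi * f1_hz, 2 * math.pi * f2_hz
    th1, th2 = 0.0, 0.0
    rate1, rate2 = w1, w2
    for _ in range(int(seconds / dt)):
        # Each oscillator is pulled towards the phase of the other.
        rate1 = w1 + coupling * math.sin(th2 - th1)
        rate2 = w2 + coupling * math.sin(th1 - th2)
        th1 += rate1 * dt
        th2 += rate2 * dt
    return rate1 / (2 * math.pi), rate2 / (2 * math.pi)


free = simulate_pair(1.9, 2.1, coupling=0.0)    # each keeps its own tempo
locked = simulate_pair(1.9, 2.1, coupling=2.0)  # both settle on one tempo
print(free, locked)
```

In this toy model the locked pair converges on the mean of the two natural frequencies. Real musicians are, of course, far more complex systems; the sketch only illustrates the physical notion of entrainment from which the term is borrowed.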

To sharpen this point, most people, musicians and non-musicians alike, are able
to tap along metronomically to a monophonic melody or rhythm. However, at many
points where a tap sounds, there may be no surface event in the music. Conversely, 
there may be many events in the music at which no tap occurs. What is particularly 
interesting about this, for our purposes, is that the ability to extract a beat from an 
irregular musical surface appears to be an almost exclusively human ability (with 
notable exceptions identified below). Theorists have created diverse computational 
and psychological theories to try to account for this ability and for the musical 
ubiquity of metre and polyrhythm. The best current explanation comes from neural 
resonance theory. 

Neural resonance is a theory [23, 24] proposing that humans have a specialised 
neural organ, which consists of a bank of actively powered oscillators with temporal 
periods covering the range from about 0.2 to 2 s. Many phenomena in music perception
can be well explained by the way in which these hypothesised oscillators tend
to entrain with sensory input. Mathematical models of this organ, based on known 
characteristics of neural oscillators, are able to reproduce the results of human tap- 
ping experiments well, not just for metrical rhythms but also for polyrhythms [23]. 
The theory of neural resonance also helps to explain the origins of musical metre: 
given a simple regular external beat with frequency f, not just the neural oscillator 
with frequency f will entrain, but also, to a lesser extent, those with frequencies 2f, 
3f, f/2 and f/3.
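
As a small worked example of the relationship just described, the following Python sketch lists the oscillator frequencies related to an external beat f by the ratios named above, and keeps those whose periods fall within the approximate 0.2 to 2 s range of the hypothesised oscillator bank. Filtering by that range is an assumption made here for illustration; the function name is hypothetical.

```python
def resonant_periods(f_hz, min_period_s=0.2, max_period_s=2.0):
    """Periods (s) of oscillators at f/3, f/2, f, 2f and 3f that lie
    inside the assumed range of the oscillator bank."""
    ratios = {"f/3": 1 / 3, "f/2": 1 / 2, "f": 1.0, "2f": 2.0, "3f": 3.0}
    periods = {}
    for label, ratio in ratios.items():
        period = 1.0 / (f_hz * ratio)
        if min_period_s <= period <= max_period_s:
            periods[label] = period
    return periods


# A beat at 120 beats per minute has frequency f = 2 Hz (period 0.5 s).
periods_120bpm = resonant_periods(2.0)
for label, period in periods_120bpm.items():
    print(f"{label}: {period:.3f} s")
```

For a beat at 120 beats per minute, the related periods of 1.5 s, 1.0 s, 0.5 s and 0.25 s all fall inside the range, whereas the 3f component (about 0.17 s) falls just outside it. These related periodicities correspond to the slower and faster levels of a metrical hierarchy built on the beat.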

It was originally thought that beat extraction was unique to humans. Indeed, 
human neonates can extract beats at birth [24], whereas EEG studies indicate
that macaque monkeys are unable to extract beats [25]. However, it
was unexpectedly discovered [26] that speech-imitating birds such as the sulphur- 
crested cockatoo Cacatua galerita eleonora have expert beat extraction abilities. The 
vocal learning hypothesis [26] suggests that rhythmic entrainment abilities may have 
developed evolutionarily as a by-product of vocal learning mechanisms. 


11.3 Applications of the Haptic Bracelets 


In this section, we consider four categories of musical use of the Haptic Bracelets 
that we have prototyped and explored. There is some overlap, but the categories help 
to illuminate the design space and involve different software. 


11.3.1 The “Haptic iPod”


One of the many uses of the Haptic Bracelets is as part of a portable Haptic Music 
Player or “Haptic iPod” (Fig. 11.2). For this application, the user listens to music 
on a smartphone, but with the crucial feature that, in time with the music, they can 
feel in the appropriate limb (by vibrotactile pulses, as detailed below) which limb 
the drummer uses to strike a drum and when. 

Users may engage with the system in a variety of ways to learn rhythms, for 
example by silently air drumming in time to the music, or if seated by tapping with 
hands and feet on nearby surfaces, or by “thigh slapping”—both recommended ways 
of learning rhythms [27]. It is straightforward for the system to sense virtual or actual 
impacts and to sonify with chosen drum sounds, should this be desired. 

For those wishing to improve their sense of rhythm, or multi-limbed rhythmic 
skills, this Haptic iPod application has the potential to be a compelling application, 
for the following reasons. 

In the case of drummers who are already expert, they can play what they feel 
(or imagine) because they have played and felt similar rhythms many times before. 
When hearing a rhythm being played by another drummer, they may recognise it as 




Fig. 11.2 A set of four Haptic Bracelets (lower left). Two users listening to music (right) and 
feeling what each limb of a drummer does and when—with the Haptic Bracelets acting as a Haptic 
iPod (upper right) 


something they can play—often already feeling in the imagination which limb should 
be playing which part of the multichannel rhythm. They have typically internalised 
a mental model of what a drummer’s arms and legs can do, by playing and listening 
over many years to rhythms, watching, hearing and trying to replicate what other 
drummers play. By contrast, for those with little or no drumming experience, the step 
between hearing a multichannel rhythm and learning to play it is not automatically 
coupled with the feel of what each limb does. This may not be a major obstacle 
when hearing a single channel rhythm, provided that the tempo is within limits, 
and the complexity of the rhythmic pattern falls within the range of what can be 
grasped and memorised. However, when rhythms involve multiple channels and 
require multiple limbs to be played in a coordinated manner, the task is much harder. 
In these circumstances, a lack of experience with how different limb movements can 
interrelate and with how different limbs are associated with different drum sounds will 
weaken the ability to transfer from hearing to playing. This is where haptics can offer 
a distinctive advantage. Coupling multichannel musical rhythms to multichannel 
haptics allows a person to feel the different channels in different limbs, thereby 
easing the transition from hearing to playing, via feeling. A similar rationale applies 
to all of the applications of the Haptic Bracelets considered below. 

Crucially, the theory of entrainment plays a key role in this explanation. In par- 
ticular, there is no suggestion that users will learn rhythms reactively by a process of 
stimulus response as in behavioural theories—reacting to each hit as it occurs. Such 
a process would not be well suited to temporal synchronisation. Rather, for typical
musical materials, the streams for each limb will tend to consist of, predominantly 
but not exclusively, short repeating patterns or figures. Consequently, after initial 
listening, users are generally able to entrain to and reproduce the streams (see Sect. 
11.4). 

For the prototype version of this system, a laptop running a DAW⁶ was used rather
than a smartphone, and the stereo audio track had an associated manually prepared 
synchronised MIDI track that mirrored the drum part. The MIDI drum tracks were 
used to drive the vibrotactiles on the bracelets, as seen in [29]. In future versions of 
the system, no manual pre-processing of the audio need be involved: software for
automatic drum part extraction could be used. However, this would identify drums
rather than limbs, which has certain limitations; this design issue is discussed in
Sect. 11.5.
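
The step from a MIDI drum track to per-limb haptic pulses can be sketched as follows. This is a minimal illustration and not the prototype's actual software: the note numbers follow the General MIDI percussion map, but the assignment of drums to limbs (a right-handed player on a standard kit) and all names are assumptions.

```python
from collections import defaultdict

# General MIDI percussion note numbers mapped to an ASSUMED limb.
# The limb assignment is a guess for a right-handed drummer.
GM_DRUM_TO_LIMB = {
    36: "right_foot",  # bass (kick) drum
    38: "left_hand",   # acoustic snare
    42: "right_hand",  # closed hi-hat
    44: "left_foot",   # pedal hi-hat
    51: "right_hand",  # ride cymbal
}


def limb_pulses(events):
    """Group (time_s, midi_note) drum events into pulse times per limb.

    Notes without a mapping are skipped rather than guessed.
    """
    pulses = defaultdict(list)
    for time_s, note in events:
        limb = GM_DRUM_TO_LIMB.get(note)
        if limb is not None:
            pulses[limb].append(time_s)
    return dict(pulses)


# Hypothetical fragment of a drum track; note 49 (a crash cymbal) is unmapped.
events = [(0.0, 36), (0.0, 42), (0.5, 42), (1.0, 38), (1.0, 42), (1.5, 49)]
pulses = limb_pulses(events)
print(pulses)
```

The fixed table makes the limitation noted above visible: a track encodes which drum sounded, not which limb struck it, so any drum-to-limb mapping of this kind is only a convention and can be wrong for a given player or passage.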


11.3.2 Drum Teaching with Haptic Bracelets 


The Haptic Bracelets operate rapidly enough to be used for real-time synchronisa- 
tion between musicians. This enables a drum teacher (Fig. 11.3, right) and learner 
(Fig. 11.3, left) to both wear a set of bracelets, and for the learner to feel in the 
appropriate limb which limb the drummer uses to strike each drum, effectively in 
real time [3, 29]. The impacts felt by each limb are detected by fast sensors, signals
are sent by Wi-Fi, and the system uses fast-acting, precise vibrotactiles. Consequently,
communication delays are generally stable and under 10 ms. Figure 11.4 shows the
control interface for adjusting tap detection on each limb of the teacher’s devices and
mapping them to the learner’s bracelets. Taking into account the speed of sound in air,
a delay of this size means that synchronisation via the bracelets over a network can be
as close as is generally achieved by musicians playing at a distance of 3.5 m from each
other, which
is considered real time for most musical purposes. Depending on the quality of the 
Wi-Fi router and other system factors, beats can exceptionally be delayed or lost, but 
because the key working principle is entrainment, occasional small disturbances do 
not matter greatly. 
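
The comparison between network delay and acoustic distance is simple arithmetic, sketched below in Python. The speed of sound used (343 m/s, roughly that of dry air at 20 °C) is an assumption, since the text does not state a value, and the function names are illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees Celsius


def equivalent_distance_m(latency_ms, speed=SPEED_OF_SOUND_M_PER_S):
    """Distance that sound travels in air during the given delay."""
    return (latency_ms / 1000.0) * speed


def acoustic_delay_ms(distance_m, speed=SPEED_OF_SOUND_M_PER_S):
    """Time that sound takes to cross the given distance in air."""
    return distance_m / speed * 1000.0


print(round(equivalent_distance_m(10.0), 2))  # metres covered in 10 ms
print(round(acoustic_delay_ms(3.5), 1))       # milliseconds to cross 3.5 m
```

Under this assumption, a 10 ms delay corresponds to roughly 3.4 m of air between two players, and sound takes a little over 10 ms to cross 3.5 m, which is consistent with the figures quoted above.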

Teaching in this way can be in person, over a distance, live or recorded, and one- 
to-one or one-to-many. Haptic Recordings can be played back later and slowed down 
for more detailed study, with limbs muted or isolated as needed. 


⁶ Digital Audio Workstation: A software programme for recording, editing and producing audio
content.
Fig. 11.3 A drum learner (left) feeling what his drum teacher (right) is doing with each limb in real
time. This particular photograph shows a silent air-drumming exercise, without drumsticks, with 
the learner looking away 


Fig. 11.4 A screenshot of the software for adjusting the tap detection of one haptic bracelet set and 
mapping it to another set 


11.3.3 Musician Coordination and Synchronisation 


The mode of operation, outlined above, of the Haptic Bracelets has more general 
applications for musician coordination and synchronisation. The Bracelets can be 
Fig. 11.5 Rudimentary two-handed rhythm: paradiddle

Fig. 11.6 Syncopated rhythm: Cuban clave pattern

Fig. 11.7 Polyrhythm: three against four


used to address the problem that, in complex situations, crucial cues between musi- 
cians can be missed in the recording studio or live on stage. 

Specific modes of use include silent count-ins, hierarchical or polyrhythmic click 
tracks, confirmation of correct device operation and inter-musician communication, 
and coordination generally. The idea of a silent count-in is straightforward and is not 
new: however, in the case of complex metres or complex polyrhythms, the bracelets 
allow silent hierarchical or polyrhythmic count-ins that explicitly enact up to four lay- 
ers of the metre or polyrhythm simultaneously to be felt in the appropriate limb. Hap- 
tic count-ins and section announcements could variously be driven by a metronome 
or MIDI score on a DAW, driven by a tapping foot, or by other physical actions of a 
musician, sounded or silent. In device feedback mode, the correct operation of foot 
pedals and other controllers can be confirmed by haptic feedback—a sophisticated 
version of this idea has been explored extensively by [28]. 


11.3.4 Teaching Multi-limb Drum Patterns by Multi-limbed 
Haptic Cueing 


The application of the Haptic Bracelets that we have explored most extensively is 
teaching multi-limb drum patterns (such as in Figs. 11.5, 11.6 and 11.7) using audio 
and haptic recordings, as studied in the next section. 
11.4 Experimental Results


In this section, we review a series of experiments carried out to test the applicability of 
haptics for learning rhythm skills. These experiments use a variety of technological 
and methodological set-ups: earlier experiments used wired systems [15, 29] that
sensed which drums were hit and when, whereas our later systems are fully wireless and
sense which limbs move and when [3, 16].


11.4.1 Supporting Learning of Rhythm Skills with the Haptic 
Drum Kit 


Our first haptic guidance system was called the Haptic Drum Kit [15]. Its main 
aim was to support the learning of rhythm skills and multi-limb coordination while 
playing drums. 

The haptic pulses sent to a particular limb indicate the exact moments at which 
notes should be played with that limb, on a specified part of the drum kit, i.e. hi-hat, 
ride cymbal, snare drum or kick drum. Because each rhythm is played repeatedly 
in a loop, the user can listen to and/or feel the pattern before trying to play along 
with one or all limbs. In other words, the aim of our design is deliberately not to 
orchestrate stimulus response but rather to foster entrainment. 

The original Haptic Drum Kit system consists of the following: vibrotactiles 
attached to the wrists and ankles using velcro bands; a computer system that feeds 
signals to the haptic devices; a stereo audio system; and a MIDI drum kit, which is 
played by the person while wearing the haptic devices. 

The MIDI drum kit is connected to the computer running sequencing and recording 
software (Logic Pro) which allows playback as well as accurate data collection. In 
the study, MIDI files encoding drum patterns (known as “guide tracks”) were played 
back by the sequencer to control the generation of audio output and synchronised 
haptic output. The vibrotactile output signals were generated through a programme 
written in Max and an Arduino board, which was connected to the actuators by wires. 

Presentation was possible in one of the three following modes: audio only; audio 
plus haptics; or haptics only. The stereo audio system was used to play back both 
the sound created by playing the MIDI drum kit and the sound from the guide track, 
when required. In the study, the participants were also recorded on video from three 
different angles. 

To explore what kinds of rhythmic patterns could be supported best by using 
haptic guidance, twenty reference rhythms were selected as stimuli, drawn from four 
broadly representative technical categories: (1) metrical rhythms, i.e. 8 beat and 16 
beat; (2) rudimentary patterns that distribute continuous strokes across two limbs, 
e.g. the alternation of single and double strokes in the paradiddle (see Fig. 11.5); (3) 
figural rhythms, involving syncopation, based on the Cuban clave (see Fig. 11.6); 
and (4) polyrhythms, e.g. 2 versus 3, 3 versus 4 (see Fig. 11.7), 2 versus 5, 4 versus
5. The rhythms included patterns for two, three and four limbs. 
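
As an illustration of how a rudiment of category (2) distributes a continuous stream of strokes across two limbs, the sketch below splits the standard paradiddle sticking (RLRR LRLL) into one stream of step positions per hand. The function name and stream labels are assumptions made for this example, not part of the study software.

```python
def sticking_to_streams(sticking):
    """Split a sticking string such as 'RLRR LRLL' into per-hand step lists.

    Each stroke occupies one step of an even grid; spaces are ignored.
    """
    strokes = [s for s in sticking.upper() if s in "RL"]
    streams = {"right_hand": [], "left_hand": []}
    for step, hand in enumerate(strokes):
        key = "right_hand" if hand == "R" else "left_hand"
        streams[key].append(step)
    return streams


paradiddle = sticking_to_streams("RLRR LRLL")
print(paradiddle)
```

Heard as audio, the paradiddle is an unbroken run of eight even strokes. Split in this way, each hand receives its own irregular but repeating figure, which is the information that a per-limb haptic pulse can convey directly.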

Afterwards, a structured interview was carried out with each participant to explore 
their views on the Haptic Drum Kit and the three conditions used in the experiment. 
Of the five participants, four were beginners, while one had five years of experience 
drumming in rock bands and taking drumming lessons. 

Although there were some interesting individual differences (see [15] for details), 
the results can be generally summarised as follows. All participants expressed an 
interest in using the Haptic Drum Kit again, and most found the system comfortable 
to wear. However, all participants found the audio clearer than the haptic presentation 
to attend to, and all found it easier to play in time with the audio than the haptic stimuli. 
Of the three forms of presentation (audio only, haptic only and audio plus haptic), 
all preferred audio plus haptic, indicating that the haptics were considered to have 
added value. 

The vibrotactile drivers for this version of the Haptic Drum Kit (version 1) 
appeared to have three weaknesses for our purposes, according to feedback from 
the five participants in the study: (1) the haptics were not felt clearly enough, espe- 
cially on the ankles; (2) the attack of the haptic pulses was somewhat blurred, making 
it difficult to recognise the precise timing of a note to be played; and (3) there was 
no relative emphasis of haptic pulses, which made it hard to clearly differentiate the 
beginning of the looping pattern. 


11.4.2 Learning Multi-limb Rhythms with Improved Haptic 
Drum Kit 


To address the weaknesses of the first version of the Haptic Drum Kit, an improved 
version was developed. This second version of the Haptic Drum Kit employs four 
C2 tactors⁷ as the vibrotactile devices. They use linear resonant actuators (LRAs)
rather than the more common eccentric rotating mass (ERM) actuators, which allows
the tactors to deliver very clear haptic signals with a very low start-up time (around 4 ms).
Details on these actuator technologies can be found in Sect. 13.2. The tactors are secured
to the limbs using elastic velcro bands. As with the earlier version of the system, a
MIDI drum kit is used to play and record the drum sounds. 

An experiment was carried out using this system with 16 participants (eleven with 
varying degrees of drumming experience, five without) to see whether this version 
was more suitable for our purposes and to explore in more detail the effects of haptic 
guidance on learning of rhythms, for four different kinds of rhythmic stimuli that all 
require multi-limb coordination. These stimuli form a subset of the rhythms used in 
the previous study: 


• Linear rudiments (e.g. paradiddle);
• Metrical rhythms (8 beat rock rhythms);
• Figural rhythms, involving syncopation, based on the Cuban clave;
• Polyrhythms, e.g. 2 versus 3, 3 versus 4, 2 versus 5, 4 versus 5.


⁷ https://www.eaiinfo.com/tactor-landing/ (last accessed on November 8, 2017).


After the playing sessions, questionnaires were used to gather participants’ feed- 
back on the different conditions. During subsequent analysis, the participants’ per- 
formance was manually scored by an experienced percussionist in terms of accuracy 
and timing, and times were recorded for the moment at which a particular pattern 
was first attempted and when it was first played correctly. 

The results of this study were very encouraging. They indicated that haptic stimuli 
can be used as a reasonable alternative for audio stimuli in drumming instruction for 
the various kinds of rhythms employed, achieving similar results in terms of learning 
speed, i.e. the time required to learn to play an exercise correctly. For accuracy, 
there were individual differences which seemed related to the participants’ previous 
experience in drumming and playing along with metronomes. 

For less experienced drummers, accuracy was highest in the haptic condition and 
lowest in the audio condition, while for the most experienced drummers there was 
little difference between conditions. Regarding timing, beginners performed best 
with audio plus haptics, whereas experts performed best with audio only. The data 
from the questionnaires showed that haptic guidance for multi-limbed drumming 
was generally well liked, and given a choice between audio, haptic or both audio and 
haptic presentation, 14 participants preferred audio plus haptic. Most participants 
enjoyed using the Haptic Drum Kit, found the tactors comfortable to wear, and all 
except one said they would like to use the system again. 

Of the two haptic devices compared, i.e. the vibrotactiles used in version 1 and
the tactors used in version 2, the tactors provided better results, both in terms of
observable performance and subjects’ attitudes. 


11.4.3 Passive Learning of Multi-limb Rhythm Skills 


To find out whether haptically supported learning of similar multi-limb rhythm skills 
could also take place while the learner is attending to another task, away from the
drums, an experiment was carried out to investigate the possibility of passive learn- 
ing of rhythms while reading [11]. Fifteen people participated in the experiment 
(eight men and seven women), aged 15-51. Three were experienced drummers (with 
approximately 10 years of experience playing the drums), five had a little drumming 
experience, and seven had no experience with drumming. 

The technology used in this study was an early version [29] of the Haptic Bracelets. 
For practical reasons, the system used for this study was wired and stationary, to 
ensure the maximum possible reliability of timing data. This version of the Haptic 
Bracelets employed C2 tactor vibrotactiles attached to each wrist and ankle, using 
elastic velcro bands. The tactors were driven by multichannel signals from a DAW. 
The experimental procedure consisted of a pretest phase, a passive learning phase
and a post-test phase, as follows. In the pretest phase, participants were asked to play 
a series of six rhythms (requiring multi-limb coordination, as in the previous study) 
on a drum kit, guided simply by audio recordings. These performances provided a 
base reference for later comparisons. During the following passive learning phase, 
away from the drum kit, participants were asked to carry out a 30-min reading 
comprehension test. Participants were asked to focus on getting the best possible 
scores on the comprehension test. 

During the comprehension test, just two of the six rhythms from the set were 
haptically “played” (without audio) to each subject via the vibrotactiles attached to 
wrists and ankles. Different pairs of rhythms were chosen for different subjects, so 
that clear distinctions could be made in the next phase. Within that constraint, in 
order to present an adequate challenge for each subject, choices were made of more 
or less rhythmic complexity to reflect different levels of previous playing experience. 

In each case, the two rhythms were played repeatedly, alternating every few min- 
utes. In the post-test phase, subjects were asked to play again at the drum kit the 
complete set of rhythms from the pretest, including the two rhythms to which they 
had been passively exposed. Finally, a questionnaire was used to gain feedback from 
the participants about their experiences during the experiment and their attitudes 
towards the Haptic Bracelet technology. 

The results from the participants’ subjective evaluations can be summarised as 
follows (for detail, and the complete set of responses from which a selection is 
provided here, see [11]). 

Most participants thought that the technology helped them to understand rhythms 
and to play rhythms better, and most preferred haptic to audio to find out which limb 
to play when. Most participants indicated that they would prefer using a combination 
of haptics and audio for learning rhythms to either modality on its own. 

Interesting quotes from participants in response to the open question “Are there 
things that you liked about using the technology in the training session?” included 
the following, all from different participants: 


It helped to differentiate between the limbs, whereas using audio feedback it is often hard to 
separate limb function. 


Clarity of the haptics. ‘seeing’ the repeated foot figure in the son clave. 


Being able to flawlessly distinguish between which limb to use. The audio is more confusing. 


The question “Are there things that you like about the haptic playback?” resulted 
in responses such as the following: 


It makes the playing of complex patterns easier to understand. 
Easier to concentrate on the particular rhythms within a polyrhythm (than audio only). 


That you could easily feel which drums you needed to play when and how quickly it went 
on to the next beat. 


The answers from participants to the question “Are there things that you don’t 
like about the haptic playback?” included the following: 
repetition gets irritating ‘under the skin’
The ankle vibrations felt weak on me and I had to concentrate hard to feel them. 
Just initially strapping on the legs. [Lack of] portability. 


All quotes above are selected from [11]. 


In other words, there seems to be room for improvement in three respects: the feel of
the haptics and the straps, especially after longer use; the inconvenience of the wires; and
the lack of personally adjustable strength levels for the haptic signal for each limb. The last two
points have already been addressed in more recent versions of the Haptic Bracelets, 
which are portable, wireless, and have individually adjustable levels. 


11.5 Related Work 


As noted earlier, there is much research on the use of haptics for communicat- 
ing different kinds of musical information, for example notifications [5], posture 
improvement [7], tempo synchronisation [8, 9], haptic guidance or augmentation in 
general [30-32] (see also Chaps. 6, 8, 9, 12, 13 and Sect. 10.3) and the effect of 
haptic feedback on quality perception and user experience [33, 34] (see also Sect. 
5.3.2.2, Chaps. 6 and 7). However, in this section we focus principally on haptics for 
rhythm skills, particularly, though not exclusively, as regards multiple simultaneous 
streams of rhythms. We will group broadly representative strands of research in this 
area as follows: 


• haptic metronomes,
• haptics applied to multiple parts of the body (or the whole body),
• haptics for non-metronomic temporal sequencing.


Having reviewed the approaches used in this work, we then compare and con- 
trast them with modes of use of the Haptic Bracelets (as considered in Sect. 11.3). 
The resultant contrasts help to illuminate various design dimensions for haptics for 
developing rhythm skills. 

One straightforward use of haptics in developing rhythm skills is as haptic 
metronomes. Recently, commercial versions of haptic metronomes have come on 
the market.⁸ Giordano and Wanderley [9] demonstrated formally that musicians can
reliably follow a tempo set by a haptic metronome. This research showed that devi- 
ation from target inter-onset interval was comparable between the auditory and the 
tactile modality. 
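
The measure mentioned here, deviation from the target inter-onset interval (IOI), can be computed as in the following sketch. It is a generic illustration under stated assumptions and not the analysis procedure of the cited study; the function names and the sample tap times are hypothetical.

```python
def ioi_deviations_ms(tap_times_s, tempo_bpm):
    """Signed deviation (ms) of each inter-onset interval from the target.

    The target IOI is one beat at the given tempo, i.e. 60 / tempo seconds.
    """
    target_s = 60.0 / tempo_bpm
    iois = [later - earlier for earlier, later in zip(tap_times_s, tap_times_s[1:])]
    return [(ioi - target_s) * 1000.0 for ioi in iois]


def mean_abs_deviation_ms(tap_times_s, tempo_bpm):
    """Mean absolute IOI deviation (ms), a simple summary of timing accuracy."""
    deviations = ioi_deviations_ms(tap_times_s, tempo_bpm)
    return sum(abs(d) for d in deviations) / len(deviations)


# Hypothetical taps recorded against a 120 bpm metronome (target IOI 0.5 s).
taps = [0.0, 0.5, 1.02, 1.5]
print(ioi_deviations_ms(taps, 120))
print(mean_abs_deviation_ms(taps, 120))
```

Computing the same summary for taps made to an auditory metronome and to a haptic one gives a like-for-like comparison between the two modalities.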

Several projects have applied haptics to multiple areas of the body for music-related
purposes, sometimes via specialised haptic garments [35] (see also Sect. 10.3)
and even via furniture [36]. However, the emphasis in these projects is generally not
on multi-stream rhythm skills. In many cases, the focus is on exploring novel aesthetic 


⁸ For example, the Soundbrenner Pulse http://www.soundbrenner.com and the Peterson BodyBeat
Pulse https://www.petersontuners.com/shop/Metronomes/ (last accessed on November 8, 2017). 
haptic perceptual effects, such as in the case of [33, 37]. In some projects of this kind
[36], the focus is strongly on Deaf culture,⁹ and on the use of crossmodal devices
and sensory substitution [38] to convey musical information through sense of touch, 
particularly for the profoundly deaf. In this context, Fulford [39] has investigated the 
extent to which tonal intervals can be accurately communicated by touch. Jack et al. 
[37] have collaborated with Deaf arts activists to produce furniture that translates 
pitch, rhythm, loudness and timbre to whole body vibration in psychometrically 
well-informed ways. 

Some work applying haptics to the whole body (or large parts of the body) may 
have some implications for improving skills related to multi-stream rhythms. An 
interesting example is a tension-based wearable vibroacoustic device by Yamazaki
et al. [40]. This device uses a cord worn around the chest, whose tension is adjusted 
by DC motors directly driven by an amplified analogue audio signal. This system 
permits the communication of an acoustic signal with finely detailed bass clarity into 
the entire chest cavity. Users scored the experience favourably particularly in music 
with prominent bass drum parts. Although this system does not spatially separate 
multiple rhythms, its bass clarity may help wearers in separating low-pitched rhythm 
parts. 

A contrasting system with clear potential relevance to multi-stream rhythm
skills is MuSS-bits by Petry et al. [41]. Designed with deaf users in mind, this system
uses wireless sensor-display pairs that map audio microphone signals more or less
directly to the voltage applied to vibrotactiles, which can be attached anywhere on 
the body. 

One strand of work has focused on haptics for temporal sequencing—particularly 
for monophonic rhythms and monophonic melodies—though recently the scope has 
widened [42, 43]. Huang et al. [44, 45] and Seim et al. [46, 47] carried out a series
of studies looking at passive learning (i.e. learning without conscious attention) of 
tasks involving sequential key presses, such as typing or playing piano melodies. A 
lightweight wireless haptic system was developed for the purpose, with a fingerless 
glove containing one vibrotactile per finger. This system was used to teach sequences 
of finger movements to users, while they performed other tasks. A sequence of finger 
movements learned in this way, if subsequently repeated with the five fingers placed 
over five adjacent keys on a musical keyboard, serves to play a monophonic melody.
Target melodies were typically restricted to five pitches, so that no horizontal move- 
ment of the hand (as opposed to vertical movement of the fingers) was needed. Sample 
melodies contained rests and notes of different durations. A study demonstrated that 
passive learning with audio and haptics combined was significantly more effective 
than audio only. A more recent study [47] involved passively training both hands 
simultaneously with material that was monophonic in the right hand but included 
simple repeating two-note chords in the left hand. This work demonstrated that users
may learn to play the left-hand and right-hand parts at once via passive haptic
learning. The work by Grindlay [42] focused on passive learning of monophonic
drum rhythms, with a mechanical installation providing haptic guidance by automatically
moving a single drumstick held by the learner. The results of this study showed
that the system supported learning of rhythms which can be played with one hand.

° Deaf culture (with a capital D) refers to a set of cultural values, behaviours and traditions associated
with deafness viewed as a distinctive and valuable human experience, as opposed to a disability.

A project that takes involuntary control of a learner’s movements to extremes is 
the Possessed Hand [48]. This system allows control of a user’s finger movements by 
applying electrical stimuli to the associated muscles using a belt with 28 electrode 
pads placed around the forearm. The makers suggest this system could be applied 
to musical applications, in particular learning correct hand posture for playing the 
piano or koto, but they mention there are issues to be considered related to reaction 
rate, accuracy and muscle fatigue. This research is highly unusual in terms of the test 
subjects’ comments, which include “Scary... just scary” and “I felt like my body 
was hacked” [48, p. 550]. 

As noted earlier, we will now compare and contrast the above work with various 
modes of use of the Haptic Bracelets in order to illuminate various dimensions of 
the interaction design space for the haptic support of rhythm skills. 

One such design dimension contrasts metronomic cueing versus interpersonal 
rhythmic interaction. Commercial haptic metronomes are excellent tools for practis- 
ing to a beat. Like the Haptic Bracelets, they can allow several musicians wirelessly 
to coordinate by sharing a common haptic metronomic beat or to be coordinated 
by cues from a MIDI score on a DAW. However, the current commercial haptic 
metronomes cannot track live limb movement so cannot, for example, deliver real- 
time multi-limb polyphonic drumming instruction from a drum teacher, as in the 
case of the Haptic Bracelets (Sect. 11.3.2). For many purposes, metronomic cueing 
is sufficient, but live interpersonal entrainment affords additional expressive, musical
and educational possibilities. 

A second design dimension involves the contrast between discrete versus ana- 
log haptic mapping. By analog mapping, we refer to simple mapping of an audio 
signal—typically amplified and filtered—to a vibrotactile transducer, as opposed to 
representing rhythmic events by discrete pulses. In the case of [41] and much of the 
work aimed at whole body experience or Deaf culture, the haptic signals are typi- 
cally more or less direct mappings of audio signals. By contrast, the Haptic Bracelets 
and commercial haptic metronomes use discrete haptic signals to represent events 
in rhythmic patterns. Discrete haptic signals need not be uniform—they can have 
different intensities, lengths and envelopes, for example to represent accents or tex- 
tures when driven by a MIDI score. Analog haptics can communicate greater subtlety 
of texture, and continuous (as opposed to discrete) signals play important roles in 
deliberately designed haptic perceptual illusions [36]. However, for some purposes 
discrete pulses can give useful simplicity to the representation of discrete musical 
events. 
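
As a rough illustration of the discrete side of this contrast, the sketch below turns MIDI-like note events into haptic pulse descriptors whose intensity and length vary with accent. The event format, the `Pulse` type, the accent threshold and the base pulse length are all assumptions made for illustration; they are not taken from the Haptic Bracelets or from any commercial haptic metronome.

```python
from dataclasses import dataclass


@dataclass
class Pulse:
    onset_s: float      # when the pulse starts, in seconds
    duration_s: float   # how long the actuator is driven
    intensity: float    # drive level in the range 0.0 to 1.0


def events_to_pulses(events, base_duration_s=0.05, accent_threshold=100):
    """Map (onset_s, velocity) events to discrete haptic pulses.

    velocity follows the MIDI convention (1-127). Accented notes
    (velocity >= accent_threshold) get a pulse twice as long.
    All constants here are hypothetical defaults.
    """
    pulses = []
    for onset_s, velocity in events:
        velocity = max(1, min(127, velocity))      # clamp to the MIDI range
        intensity = velocity / 127.0               # louder note, stronger pulse
        factor = 2 if velocity >= accent_threshold else 1
        pulses.append(Pulse(onset_s, base_duration_s * factor, intensity))
    return pulses


# A four-beat pattern with accents on the first and last beats.
pattern = [(0.0, 120), (0.5, 64), (1.0, 64), (1.5, 127)]
pulses = events_to_pulses(pattern)
```

An analog mapping, by contrast, would pass the filtered audio waveform itself to the transducer rather than a list of event descriptors such as these.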

Choices in the system used for sensing rhythmic events can have interesting design 
implications when representing polyphonic rhythms, especially when taking cues 
from a live drummer or teacher. MuSS-bits [41] offers an instructive contrast in this 
respect with the Haptic Bracelets. MuSS-bits uses analog wireless sensor—display 
pairs that map microphone signals directly to vibrotactiles. Such a system can readily 
be used to route different haptic signals onto different limbs, but a simple microphone 
is less well suited to detecting which limb is striking a drum and when, and better
suited to detecting which drum has been struck. This can have advantages in situations 
where the same limb plays more than one drum, but can have disadvantages where, 
for example, two limbs alternate in their playing of a single drum (Fig. 11.5). 

Yet another design dimension involves the choice of body location(s) when apply- 
ing haptics. Different locations have different advantages for different applications. 
For example, as noted earlier, the tension-based system by Yamazaki et al. [40] allows
clear communication through the chest of highly detailed bass vibrations, whereas 
Lewiston [43], Huang et al. [44, 45] and Seim et al. [46, 47] focus on individual
fingers, and the Haptic Bracelets focus primarily on the limbs. MuSS-bits by con- 
trast emphasises flexibility in choice of body locations for its wireless sensor—display 
pairs. Choice of body location for haptics can have a variety of subtle effects on the 
perception of haptic signals beyond the scope of this chapter—a general discussion 
of this issue can be found in [49]. 

Finally, there is an important difference between the work by Grindlay [42], 
Tamaki et al. [48] and our own, related to the dimension of control. Although very 
different, their systems are both able to physically control human movements, while 
in our work (and most other related work) the haptics only communicate signals to 
guide the user’s movement, and the user remains in control of all physical actions. 


11.6 Conclusions 


Music is an evolutionarily ancient human activity [50], and rhythm plays a funda- 
mental role in it. Understanding and playing several rhythms simultaneously is one 
of the most challenging rhythm skills to learn. In this chapter, we have argued that of 
all the sensory modalities, touch has a special role to play in learning and teaching 
multi-limbed rhythms. This is because it allows different rhythmic components to 
be directly experienced simultaneously but separately in the relevant limbs. When 
experiencing rhythms haptically in this way, users find it relatively easy to mentally 
direct their attention to the sensations in any single limb or arbitrary combinations of 
limbs [11]. In many other musical applications of haptics, the user is simply called 
upon to be reactive, e.g. to respond to notifications, feedback or guidance, or to 
passively experience aesthetic effects. By contrast, the use of haptics in support of 
rhythm skills draws on sophisticated predictive skills, in particular the distinctively 
human capability of biological entrainment. 

For the above reasons, we designed and built a series of systems, starting with 
Haptic Drum Kit and more recently the wireless Haptic Bracelets [3, 16]. We have 
used these systems to study new ways of learning rhythm skills. They all provide 
multiple streams of haptic signals to the body using vibrotactile devices around the 
wrists and ankles to guide the timed movement of these limbs in time with repeated 
rhythmic stimuli. The development of this work was inspired by research from various 
fields, including music education (e.g. Dalcroze Eurhythmics), musicology, music 
psychology and cognitive science, in particular the theories of biological entrainment
and neural resonance.

In this chapter, we have described several applications of the wireless Haptic 
Bracelets, including: (1) a portable Haptic Music Player, or “Haptic iPod”, which 
provides four channels of vibrotactile pulses that track drum parts in time with the 
music; (2) live interactive drum teaching with Haptic Bracelets worn by both teacher 
and learner, enabling the learner to feel in the appropriate limbs what the teacher is 
playing; (3) musician coordination and synchronisation, using the Haptic Bracelets 
to communicate musical cues such as count-ins, multichannel click tracks or section 
announcements in situations where audio may not be appropriate, such as recording 
studios or live on stage—these may be driven by a metronome, DAW or physical 
actions of a musician; and (4) teaching multi-limb drum patterns by multi-limbed 
haptic cueing. 

Focusing on the last type of application, we have carried out three empirical studies 
with different versions of the Haptic Drum Kit and Haptic Bracelets to evaluate their 
usability and usefulness for this purpose. There was evidence that: 


• haptic stimuli can be used to learn to play a variety of multichannel rhythms,
generally taking the same amount of time to learn as via audio alone,

• there was an overwhelming preference for haptics plus audio (compared with audio
alone) for learning multi-limb rhythms,

• most participants preferred haptic to audio to find out which limb to play when,

• novices in particular benefit from haptics, compared to people with more drumming
experience,

• participants considered that passive haptic playback of rhythms while reading
helped them to better understand and play those rhythms.


Compared to related work on using haptics for music education, our approach 
seems to be unique in the focus on supporting the acquisition of rhythmic skills that 
involve multi-limb coordination by providing multichannel haptic signals to both 
wrists and ankles, although the Haptic Bracelet technology is flexible enough to 
support a range of other applications. 

Several areas of further research are suggested by this work, with relevance to 
various disciplines, including music perception, cognition and production; music 
education; music and the deaf; human synchronisation; sports science; neuroscience; 
and digital health. More empirical studies are needed to better understand factors that 
may affect the learning of multi-limb rhythm skills, including: 


• different locations for placing haptic transducers on the body,

• different strategies for haptically separating multi-limb drum parts (e.g. by drum
versus by limb),

• different vibrotactile technology,

• analog versus discrete haptic encodings of rhythms,

• the optimisation of discrete haptic “timbres” and intensities,

• conditions promoting active versus passive haptic learning.

More attention is needed to factors such as different levels of drumming experience,
the selection of rhythms and the types of guidance provided (audio, haptic, visual
or combinations). Better techniques are needed for automated analyses of drum- 
ming performance, characterising timing and accuracy in coordination of the limbs. 
We need to better understand the interplay between cognitive (e.g. symbolic) and 
embodied (e.g. Haptic Bracelets) approaches to internalising multiple simultane- 
ous rhythms. Other directions for future work include investigating music-teaching 
applications that make use of the increased level of interactivity between teachers 
and learners provided by systems such as the latest version of the Haptic Bracelets. 
These systems may have particular relevance for deaf musicians. Finally, more work 
is needed on applications of the Haptic Bracelets in therapeutic settings in the health 
domain, e.g. combining musical stimuli with haptic guidance to support rehabilita- 
tion of walking skills for survivors of stroke and other neurological conditions. 

Chapter 12
Touchscreens and Musical Interaction


M. Ercan Altinsoy and Sebastian Merchel 


Abstract Touch-sensitive interfaces are increasingly used for music production.
Virtual musical instruments, such as virtual pianos or drum sets, can be played on 
mobile devices like phones. Audio tracks can be mixed using a touchscreen in a DJ 
set-up. Samplers, sequencers or drum machines can be implemented on tablets for 
use in live performances. The main drawback of traditional touch-sensitive surfaces 
is the missing haptic feedback. This chapter discusses whether adding specifically designed
vibrations helps improve the user interaction with touchscreens. An audio mixing
application for touchscreens is used to investigate whether tactile information is useful for
interaction with virtual musical instruments and percussive loops. Additionally, the 
interaction of auditory and tactile perception is evaluated. The effect of loudness on 
haptic feedback is discussed using the example of touch-based musical interaction. 


12.1 Introduction 


The usage of touch-sensitive interfaces has rapidly increased over the last 10 years, 
partially due to many successful applications for smartphones and tablets. Another 
reason is the enhanced interaction capabilities of touchscreens in comparison with 
the mouse. For example, their multi-touch capability allows the device to recognise 
more than one point of contact. Gesture-based communication can be realized easily 
using touchscreens. Additional interface elements, such as buttons, knobs, sliders, can 
be individually arranged depending on the application. These aspects make devices
with touch-sensitive surfaces very interesting for music-based applications. Virtual
musical instruments as well as audio mixing and music composition applications
benefit strongly from this trend. There are various apps which try to simulate existing
musical instruments or to create new music experiences (Fig. 12.1).

Fig. 12.1 Digital touch instrument apps: a piano, b drum and c liveloops from the GarageBand
(http://www.apple.com/ios/garageband/, last accessed on 25 Nov 2017) DAW, d sound objects
(https://itunes.apple.com/us/app/sound-objects/id656640735?mt=8, last accessed on 25 Nov 2017)

Wanderley and Battier [1] described the importance of gestures and their recog- 
nition for music performance. Choi categorized gestural primitives as trajectory- 
based primitives, force-based primitives and pattern-based primitives. Several of 
these primitives can be recognized using touch-sensitive interfaces [2]. 

Several table-based interfaces for musical applications have been developed
recently: the Reactable (Rotor¹), Akustisch², Bricktable, Surface Music, Sound
Storm³ or ToCoPlay [3-6]. Most of these devices use a tangible interface where the
player controls the system by means of real objects. Musical applications running on 
touchscreen devices such as smartphones and tablets followed this trend. However, 
not only gesture recognition but also haptic feedback plays an important role in the
success of such applications. The missing haptic feedback in touchscreen-based
devices strongly limits the capabilities of the system. The design of musical
applications calls for the addition of advanced haptic feedback [7, 8]. For audio 
mixing, music composition applications and musical performances, touchscreen sys- 
tems with haptic feedback are very promising. 


¹ http://reactable.com/rotor/ (last accessed on 17 Nov 2017)
² http://modin.yuri.at/tangibles/data/akustisch.mp4 (last accessed on 17 Nov 2017)
³ http://subcycle.org/ (last accessed on 25 Nov 2017)

Several technical solutions have been developed for haptic feedback integration
in touchscreen devices. Various types of low-cost and compact actuators are cur- 
rently used in consumer electronics, having different characteristics [9]. In recent 
years, electrostatic and ultrasonic technologies have been researched for use in hap- 
tic interfaces. On touchscreens using electrostatic technology, finger movements over 
the touch surface induce an electric force field due to electrostatic friction [10, 11]. 
Various systems exist based on ultrasonic technology such as mid-air (no direct 
contact with the surface) [12, 13] or touch interfaces [14-16]. The latter employ 
ultrasonic vibrations to create a squeeze film of air between the vibrating surface and 
the fingertip, thus modulating the surface’s friction. Focused ultrasound is capable 
of inducing tactile, thermal and tickling sensations [17, 18]. Neither electrostatic nor
ultrasonic technologies use any moving parts.

Over the last few years, the authors have conducted several investigations with
touchscreen-based devices to understand and improve the capabilities of such
systems for musical applications [19-24]. In this chapter, various aspects of these
investigations are summarized, extended and discussed. In particular, musical
interactions with touchscreens require consideration of both auditory and haptic perception. In
most cases, the haptic feedback is generated by means of the audio signal; therefore, 
the interaction of both is an important issue. This chapter aims to illustrate some 
fundamental aspects of haptic and audio feedback for touchscreen-based musical 
applications and introduce the benefits of audio—tactile interaction. 


12.2 Perceptual Aspects of Auditory and Haptic Modalities 
for Musical Touchscreen Applications 


Playing a musical instrument is a complex task, and optimized multisensory stimuli 
may be useful, e.g. supporting spatial and temporal accuracy. Sound and vibration 
are physically coupled while playing a musical instrument or listening to music live 
or through loudspeakers. The knowledge of auditory and haptic psychophysics is 
necessary for the designer of multimodal interfaces to develop high-quality devices. 
In this section, perception of intensity, frequency and temporal aspects is discussed 
with respect to their importance to musical applications. 


12.2.1 Intensity 


Dynamic ranges of the auditory and tactile perceptions differ greatly. Although the 
perceivable dynamic range for hearing is approximately 130 dB, tactile perception 
can only discriminate a dynamic range of 50 dB. The just-noticeable differences
(JNDs) in level for both modalities are about 1 dB. In music applications, such
dynamic range differences should be taken into account, especially if
haptic feedback is produced using audio signals: The perceived vibration magnitude
might rise rapidly from imperceptible to strong if vibrations are generated from
an audio signal with a wide dynamic range. Therefore, it might be advantageous to apply
dynamic compression [21].

Fig. 12.2 Growth of perceived magnitude as a function of sensation level for acoustical and vibratory
stimuli at 250 Hz [19, 21, 22]
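
To make the idea of dynamic compression concrete, the following sketch maps an audio level onto the narrower tactile range with a fixed ratio in the dB domain. The range figures (130 dB and 50 dB) come from the text; the choice of a static, linear-in-dB mapping and the function name `compress_level` are assumptions for illustration and need not match the compressor used in [21].

```python
AUDIO_RANGE_DB = 130.0    # approximate dynamic range of hearing (from the text)
TACTILE_RANGE_DB = 50.0   # approximate dynamic range of touch (from the text)


def compress_level(audio_level_db):
    """Map an audio level (dB above hearing threshold) to a vibration level
    (dB above tactile threshold) using a fixed compression ratio.

    Assumption: a static mapping that is linear in dB; real compressors
    usually add a threshold, attack and release times.
    """
    clamped = max(0.0, min(AUDIO_RANGE_DB, audio_level_db))
    return clamped * TACTILE_RANGE_DB / AUDIO_RANGE_DB


# Compression ratio implied by the two ranges: 2.6 dB of audio per 1 dB of vibration.
ratio = AUDIO_RANGE_DB / TACTILE_RANGE_DB
```

With this mapping, a 13 dB change in the audio signal becomes a 5 dB change in vibration level, so quiet passages stay perceptible while loud ones do not saturate the tactile channel.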

Intensity perception across the two modalities shows different behaviours. At 
1 kHz, an increase of 10 dB in sound pressure level causes a sense of doubling 
in perceived loudness. At 250 Hz, an increase of 4-8 dB in vibration level causes a
sense of doubling in perceived vibration intensity. In Fig. 12.2, the perceived intensity
growth functions of auditory and tactile modalities are compared at the same frequency
(250 Hz): The rate of growth for the tactile modality is higher than for the auditory
modality. 
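
The doubling figures quoted above can be turned into a small worked example. The sketch below assumes that perceived magnitude doubles for every fixed step in level, so that a level increase of ΔL dB multiplies the perceived magnitude by 2^(ΔL/d), where d is the number of dB per doubling. This constant-step model is a simplification adopted here for illustration; the measured growth functions in Fig. 12.2 need not follow it exactly.

```python
def magnitude_ratio(level_increase_db, db_per_doubling):
    """Factor by which perceived magnitude grows for a given level increase,
    assuming one doubling per `db_per_doubling` dB (a simplifying assumption)."""
    return 2.0 ** (level_increase_db / db_per_doubling)


increase_db = 20.0
loudness_growth = magnitude_ratio(increase_db, 10.0)        # hearing: 10 dB per doubling
vibration_growth_low = magnitude_ratio(increase_db, 8.0)    # touch: 8 dB per doubling
vibration_growth_high = magnitude_ratio(increase_db, 4.0)   # touch: 4 dB per doubling
```

Under these assumptions, a 20 dB increase quadruples perceived loudness, whereas perceived vibration intensity grows by a factor between roughly 5.7 and 32.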


12.2.2 Frequency 


In most musical applications, the frequency spectra of auditory and vibrotactile cues 
are coupled to each other by physical laws. Such frequency coupling plays an impor- 
tant role in how humans integrate auditory and tactile information [19]. 

Sounds that are audible to the human ear fall in the frequency range of about 
20-20,000 Hz, with highest sensitivity between 500 and 4000 Hz. Just-noticeable 
frequency differences (JNFDs) for the auditory system were reported by Zwicker and
Fastl [25]. They found that, at frequencies below 500 Hz, humans are able to
differentiate between two tone bursts with a frequency difference of only about 1 Hz, 
and this value increases with frequency. Above 500 Hz, the JNFD is approximately 
0.002 times the frequency. 

The frequency range of auditory perception is much wider than that of tactile 
perception: The skin is sensitive to frequencies between 1 and 1000 Hz, with highest 
sensitivity in the range of 200-300 Hz. JNFDs for sinusoidal vibrations and tac- 
tile pulses on the finger and volar forearm were measured by different researchers 
[25-27]. The values for the Weber fraction (difference threshold divided by stimulus 
intensity) range from 0.07 to 0.2. Frequency discrimination of the tactile channel is 
fairly good at low frequencies but deteriorates rapidly as frequency increases [25]. 

Overall, these results indicate that the skin is rather poor at discriminating fre- 
quency in comparison with the ear. 
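
The size of this gap can be illustrated numerically. The sketch below encodes the figures quoted in this section as two small functions: an auditory JNFD of about 1 Hz below 500 Hz and 0.002 times the frequency above, and a tactile JNFD given by a Weber fraction between 0.07 and 0.2. Treating these as exact piecewise formulas is a simplification, and the function names are hypothetical.

```python
def auditory_jnfd_hz(frequency_hz):
    """Approximate auditory JNFD: about 1 Hz below 500 Hz, 0.002 * f above."""
    return 1.0 if frequency_hz < 500.0 else 0.002 * frequency_hz


def tactile_jnfd_hz(frequency_hz, weber_fraction):
    """Approximate tactile JNFD for a given Weber fraction (0.07 to 0.2 in the text)."""
    return weber_fraction * frequency_hz


f = 250.0                                   # region of highest tactile sensitivity
ear = auditory_jnfd_hz(f)                   # about 1 Hz
skin_best = tactile_jnfd_hz(f, 0.07)        # about 17.5 Hz
skin_worst = tactile_jnfd_hz(f, 0.2)        # about 50 Hz
```

At 250 Hz, the smallest frequency step the skin can detect is therefore well over ten times larger than the step the ear can detect, which limits how much melodic information a vibrotactile signal can carry.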


12.2.3 Temporal Acuity and Rhythm Perception 


Conversely, the auditory modality shows an extraordinary temporal resolution. As an
example, two impulses will be perceived as separate sounds even if there is only a 1-2 ms
gap between them. Although the temporal acuity of the cutaneous system is not as
high as that of the auditory system, individuals can still distinguish a gap of 8-10 ms
between two tactile sinusoidal bursts [28, 29]. In any case, in comparison with vision,
both audition and vibrotaction have very high temporal resolution.

Apart from temporal acuity, the perception of rhythm is an important capability 
of both modalities. In all cultures, it is common that people tap or move their hand, 
foot or other body parts in synchrony with music [30]. The processing of such metric 
information is only possible through the auditory and tactile/somatosensory channels, 
but not by means of vision. A research study by Brochard and colleagues shows 
that humans can abstract the metric structure from tactile rhythmic sequences as 
efficiently as from equivalent auditory patterns [31]. This ability is independent of
musical expertise. Various scientists assume that the early-developing relationship
between the auditory modality and movement-related sensory inputs is maintained
in adulthood [32]. The results of Bresciani et al. [33] show that the visual modality 
alone plays a minor role in feeling the contact with objects, at least when tactile and 
auditory modalities are available. 


12.2.4 Synchrony 


Temporal correlation is an important cue for the brain to integrate multiple sensory 
inputs generated by a single event, as well as to differentiate inputs related to sep- 
arate events occurring at the same time. However, the synchronization of different 
modalities in multimedia applications is a major issue, due to technical constraints
such as data transfer time, computer processing time and delays that occur during
feedback generation processes. As the asynchrony between different modalities
increases, the sense of presence and realism of multimedia applications decreases.

Several results are available on audio-tactile asynchrony perception [34, 35], indi- 
cating that, in order to preserve a unitary percept, the temporal discrepancy between 
the auditory and tactile modalities must be within 25 ms for various multimedia 
systems. However, for the purpose of the discussion in this chapter, it is necessary 
to consider the literature focusing on touchscreens. Kaaresoja has measured the tol- 
erable multimodal latency in mobile touchscreen virtual button interaction, showing 
that tactile feedback latency should not exceed 25 ms and audio feedback latency 
should not exceed 100 ms [36]. Unfortunately, most of the current mobile phones or
tablets cannot meet these latency requirements. Such latency issues have a negative effect
on the quality of musical interaction. Therefore, the progress of multimodal technol- 
ogy with respect to synchrony and latency will play an important role for the success 
of musical touchscreen applications. 
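
The latency limits reported by Kaaresoja [36] lend themselves to a simple budget check. The sketch below compares measured feedback latencies against the two limits quoted above; the function name and the example measurements are hypothetical and serve only to show how such a check might be written.

```python
MAX_TACTILE_LATENCY_MS = 25.0    # limit for tactile feedback, from [36]
MAX_AUDIO_LATENCY_MS = 100.0     # limit for audio feedback, from [36]


def latency_report(tactile_ms, audio_ms):
    """Check measured touch-to-feedback latencies against the quoted limits."""
    return {
        "tactile_ok": tactile_ms <= MAX_TACTILE_LATENCY_MS,
        "audio_ok": audio_ms <= MAX_AUDIO_LATENCY_MS,
    }


# Hypothetical measurements for an imaginary device.
report = latency_report(tactile_ms=40.0, audio_ms=80.0)
```

In this made-up case the audio path is within budget while the tactile path is not, which is the more demanding of the two requirements.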


12.3 Experiment 1: Identification of Audio-Driven Tactile 
Feedback on a Touchscreen 


Grooveboxes can be considered as a combination of a control surface, a sampler, a 
music sequencer and a drum computer. They are popularly used for the production of 
various kinds of loop-based music styles, such as electro, techno, hip hop, especially 
in live concerts. Touchscreen-based grooveboxes may enable the user to redefine the 
combination, organization and size of the knobs, sliders, buttons [20]. In groovebox 
applications, the possibility to identify and discriminate the available musical loops 
is crucial to the user. A series of four experiments (referred to as 1a-d) was set up,
whereby tactile feedback was generated from audio signals based on four different 
approaches. Tactile signal parameters were systematically varied according to the 
perceptual characteristics discussed in Sect. 12.2. The objective was to test which 
tactile feedback processing strategies helped distinguish audio loops. Furthermore, 
the attractiveness of the system, including pragmatic and hedonic qualities, was 
evaluated. 


12.3.1 Stimuli 


The main discriminant acoustic features of musical instruments are the frequency and 
amplitude structure, and temporal envelope of the produced tones. Most percussive 
instruments are unpitched (e.g. the snare), while others excite auditory pitch percep- 
tion (e.g. the kettledrum). Features such as melody, rhythm and dynamics must be 
processed to some extent to generate a suitable vibrotactile signal from the acoustical
signal. To this end, various strategies have been applied in the experiments reported 
in this chapter, similar to what is described in Sect. 7.3. 

The simplest way to generate tactile feedback from acoustic signals is by low-pass
filtering, as done in experiments 1a and 1d with the cut-off frequency set to 1 kHz.
As discussed already, auditory and tactile signals have strong similarities in the 
frequency domain. However, the tactile system is not sensitive to frequencies above 
1 kHz. 
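
A minimal sketch of the low-pass strategy is given below. The chapter states only the 1 kHz cut-off; the first-order (one-pole) filter, the 44.1 kHz sample rate and the helper names are assumptions chosen to keep the example short, and a steeper filter would be a reasonable alternative.

```python
import math


def one_pole_lowpass(samples, cutoff_hz=1000.0, sample_rate_hz=44100.0):
    """First-order low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def sine(frequency_hz, n, sample_rate_hz=44100.0):
    """Generate n samples of a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * frequency_hz * i / sample_rate_hz)
            for i in range(n)]


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


# A 100 Hz tone passes almost unchanged; an 8 kHz tone is strongly attenuated.
low = one_pole_lowpass(sine(100.0, 4410))
high = one_pole_lowpass(sine(8000.0, 4410))
```

Comparing `peak(low)` with `peak(high)` shows the intended effect: content within the tactile range survives, while content above it is suppressed before reaching the actuator.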

Experiment 1b investigated the use of a frequency-shift strategy to generate vibro- 
tactile feedback from the original audio signal. Assuming that good integration 
between auditory and tactile information occurs when the acoustical frequency is 
a harmonic of the vibration frequency, the spectrum of the audio signal was shifted 
down one octave by means of a granular synthesis technique. While this preserved
accurate timing, the processing resulted in some unwanted artefacts. However,
such artefacts are produced especially at higher frequencies, mostly above the
range of tactile perception (see Sect. 4.2).

In experiment 1c, beat information was extracted from audio loops by looking for fast
attack transients in the amplitude envelope. The detected beats triggered sinusoidal
pulses at 100 Hz lasting 80 ms, which are easily perceived.
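
The beat-substitution strategy can be sketched as follows. The pulse frequency (100 Hz) and duration (80 ms) are taken from the text; the detector itself is an assumption, namely a peak-following envelope with two thresholds (hysteresis), which stands in for whatever transient detector was actually used. The sample rate, thresholds and the synthetic test loop are likewise hypothetical.

```python
import math

SAMPLE_RATE_HZ = 8000        # assumed; kept low so the example runs quickly
PULSE_FREQ_HZ = 100.0        # from the text
PULSE_DURATION_S = 0.080     # from the text


def detect_beats(samples, high=0.5, low=0.1, release_s=0.02):
    """Return onset indices found by a peak follower with hysteresis."""
    decay = math.exp(-1.0 / (release_s * SAMPLE_RATE_HZ))
    envelope, armed, onsets = 0.0, True, []
    for i, x in enumerate(samples):
        envelope = max(abs(x), envelope * decay)   # instant attack, slow release
        if armed and envelope >= high:
            onsets.append(i)                       # fast rise: report a beat
            armed = False
        elif not armed and envelope <= low:
            armed = True                           # re-arm once the hit has decayed
    return onsets


def render_pulses(onsets, length):
    """Place a 100 Hz, 80 ms sinusoidal pulse at every detected onset."""
    out = [0.0] * length
    pulse_len = round(PULSE_DURATION_S * SAMPLE_RATE_HZ)
    for onset in onsets:
        for k in range(min(pulse_len, length - onset)):
            out[onset + k] = math.sin(
                2.0 * math.pi * PULSE_FREQ_HZ * k / SAMPLE_RATE_HZ)
    return out


def toy_loop(hit_times_s, duration_s=1.0):
    """Synthetic stand-in for an audio loop: decaying 200 Hz bursts."""
    n = int(duration_s * SAMPLE_RATE_HZ)
    out = [0.0] * n
    for t0 in hit_times_s:
        start = int(t0 * SAMPLE_RATE_HZ)
        for k in range(n - start):
            t = k / SAMPLE_RATE_HZ
            out[start + k] += math.exp(-t / 0.03) * math.sin(
                2.0 * math.pi * 200.0 * t)
    return out


loop = toy_loop([0.1, 0.6])
beats = detect_beats(loop)
vibration = render_pulses(beats, len(loop))
```

Because every beat is replaced by the same pulse, the output keeps the rhythm of the loop but discards its spectral content, which is exactly the separation of sequence and source features that this condition is meant to test.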


12.3.2 Set-up 


An Apple iPod Touch⁴ was used as the touch-sensitive input device, while tactile feedback
was delivered by an electrodynamic exciter (Monacor BR-25) mounted at the
back of the iPod (see Fig. 12.3). Its touchscreen surface was divided into six virtual 
buttons, each of which corresponded to a specific audio loop. When the participant 
pressed a button, tactile feedback for the respective channel was rendered in real time 
using Pure Data, while the audio signals were reproduced by closed-back reference 
headphones (Sennheiser HDA 200). The headphones offered effective sound isolation
and therefore masked the background noise generated by the tactile system. The task
was to associate each vibrating button with the corresponding audio signal.


12.3.3 Subjects 


Twenty subjects, sixteen male and four female, aged between 20 and 40 years, partic- 
ipated in the experiment. They had no knowledge of acoustics, and they voluntarily 
participated in this study. All subjects were right-handed and had self-reported nor- 
mal hearing. 


⁴ https://en.wikipedia.org/wiki/IPod_Touch (last accessed on 15 Nov 2017).


Fig. 12.3 Touchscreen device was mounted on an electrodynamic shaker for vibration reproduction 


12.3.4 Results and Discussion 


In this section, the results of the identification investigations for different signal 
processing strategies are summarized. 


12.3.4.1 Low-Pass Filtering 


In experiment 1a, the six vibrotactile stimuli were generated by low-pass filtering
the audio loops at 1 kHz. 

The percentages of correct responses for the stimuli are shown in Fig. 12.4a. Subjects
could correctly identify most of the instruments. Errors are particularly low
for percussion instruments which generate mainly higher frequencies, such as the 
snare, hi-hat or tambourine: The percentage of correct responses for snare, hi-hat and 
tambourine is higher than 80%. The participants reported that temporal envelope and 
frequency content were important cues. 


12.3.4.2 Pitch Shifting 


In experiment 1b, the vibration signals were generated by shifting the spectra of the
audio loops down by one octave. The resulting signals were low-pass filtered at
1 kHz to get rid of high-frequency artefacts due to the processing. 

The percentages of correct responses for the six stimuli are shown in Fig. 12.4b.
Compared to simple low-pass filtering, octave shifting improved the identification
of the loops. Indeed, pitch shifting allowed important components of the
original sounds to be perceived through the tactile sense. For instance, the attack of the kick drum
presents relevant content at frequencies above 1 kHz. The kick drum and shaker
could be better identified than in the low-pass filtering condition, but there were
slightly more errors between the hi-hat and snare, perhaps because the hi-hat was
perceived as more intense than before as its dominant high-frequency energy was shifted
towards lower frequencies. However, it is unclear whether features of the sequence
(e.g. rhythm) or features of the source (e.g. frequency content) or both influenced
the results; therefore, experiments 1c and 1d focused on separating the sequence and
source features.

Fig. 12.4 Results of the identification experiment for different percussive instruments (audio
loops). The vibration signals were generated by processing the audio signal via a low-pass filtering
with cut-off at 1 kHz and b pitch shifting one octave down


12.3.4.3 Beat Detection 


In experiment 1c, the individual loops were analysed and their beat was detected, 
which in turn triggered artificial vibration signals. Thus, source features such as 
frequency content were not conveyed by the vibration signal, while the original
rhythmic sequence was preserved. 

Results are shown in Fig. 12.5a. While rhythm is an important factor for loop 
identification, the overall detection rate decreased. This showed that other features 
of musical signals play an important role. 

Fig. 12.5 Identification results for different instruments. The vibration signals were generated
using a sequence features (beat detection and signal substitution) and b source features (low-passed 
percussive hits) 


12.3.4.4 Single Hits 


In experiment 1d, rhythm (sequence) information was removed to test whether a 
percussion instrument could be identified with only source features; thus, only a 
single hit was reproduced. Accordingly, the bass line and tambourine loops were 
removed from the stimuli set, and other percussion sounds (guiro and handclap) 
with distinct source features were added. The vibration signals were generated by 
low-pass filtering single hits at 1 kHz. 

As seen in Fig. 12.5b, the kick drum and snare were identified with 100% accu- 
racy, possibly due to their characteristic frequency content, which resulted in clearly 
distinct tactile perceptual qualities. Of the remaining instruments, the guiro had 
the highest number of correct identifications, perhaps because of its typical time 
structure (rattle like) that distinguishes it from the instruments with different time 
structures (bang like). The high-frequency percussive sounds were not differentiated 
well. Subsequent experiments revealed that the detection rate did not improve with 
octave shifting the single hits, or by adding a preliminary training phase. 


12.3.4.5 Summary 


The best identification rates were obtained when the source and sequence features 
were preserved (low-pass filtered or octave-shifted signals). Identification relying on 
rhythm information (beat detection) was observed to be time consuming and varied
largely between subjects: the average identification time was approximately 10 s
per loop in experiment 1c, while only 6 s were needed in experiments 1a and 1b and
8 s in the case of 1d.


12.3.5 Usability and Attractiveness 


Before and after the experiments reported above, participants were asked to mix the 
six audio loops into a 90 s composition using the set-up described in Sect. 12.3.2. 
Instead of buttons, six faders were used to blend the different audio signals. In 
the first set, a conventional groovebox without tactile feedback was simulated. In 
the second set, audio-driven tactile feedback was rendered using the octave shift 
approach described in Sect. 12.3.4.2. When the finger of the user came in contact 
with a fader, vibration for the respective channel was rendered. 

After completion, participants were asked to judge the usability and attractiveness 
of the groovebox using the AttrakDiff [37] semantic differential. This method uses 
pairs of bipolar adjectives to evaluate the pragmatic and hedonic qualities of inter- 
active products. The adjectives, grouped under four categories, and relative across- 
participants mean semantic ratings are reported in Fig. 12.6. The pragmatic quality 
is on average better without tactile feedback; this was likely due to participants
experiencing some difficulty with audio-tactile association in the prior experiments.
The individual ratings for the tactile feedback set-up varied, indicating disagreement 
between subjects. However, the difference in pragmatic quality is not statistically 
significant (dependent t test for paired samples, p > 0.05). On average, the hedonic 
quality was better with tactile feedback, especially for the “stimulation” aspect (p < 
0.05). The hedonic category “stimulation” refers to the ability of a product to support
the user's further personal development. The groovebox with audio-driven tactile
feedback was rated as more innovative, captivating and challenging. These results are 
in agreement with other studies that evaluated multimodal feedback [38]. The over- 
all attractiveness of the groovebox remains the same with or without audio-driven 
tactile feedback. This result is reasonable if the attractiveness is understood based 
on the hedonic and pragmatic qualities, where each contributes in equal parts to the 
attractiveness of a product [35]. 

Obviously, the presented data are only valid for the specific exercise and the labo- 
ratory conditions described above, while results might change depending on task and 
context. For example, in a real live set it might be more important to know if a finger 
is on the correct fader; tactile feedback might also help DJs match beats between 
different tracks, influencing their pragmatic quality perception. Thus, conclusions 
should be drawn carefully. 

Fig. 12.6 Mean values of the AttrakDiff semantic differential for seven items on each of the four
dimensions: pragmatic quality, hedonic quality (identity), hedonic quality (stimulation) and
attractiveness

In most touchscreen-based consumer devices, such as mobile phones and tablets,
smaller low-fidelity actuators are used instead of the electrodynamic exciter that was
used in the described experiments. Small actuators have several limitations in terms
of the achievable vibration intensity and frequency range. Additionally, they have a
slow temporal response in comparison with other technologies, such as voice
coil or piezoelectric actuators (see Sect. 13.2 for a review of actuator technology). To
overcome such limitations, multimodal interaction can be very promising as it can
compensate for what is lacking in one modality with higher fidelity in another channel.
In this perspective, a further experiment was conducted to investigate crossmodal
intensity interaction between the auditory and tactile channels.


12.4 Experiment 2: Effect of Loudness on Perceived Tactile 
Intensity of Virtual Buttons 


For several conventional or digital musical instruments, one fundamental interaction 
is that of pressing a button or a key [39]. Also, interaction with the user interface 
of DMIs (e.g. a groovebox) or mixing consoles is often mediated by buttons. This
experiment aims to investigate the effect of loudness on the perceived intensity of
tactile feedback provided by a touchscreen. 


12.4.1 Stimuli 


An impulsive waveform, representing the feedback produced by a conventional
button, was selected as the tactile signal. The stimulus amplitude corresponds to the
perpendicular displacement of the surface, with positive values indicating movement
towards the subject. In order to be compatible with the characteristics of small
actuators, a relatively small amplitude was selected. The maximum amplitude of the
stimulus, which occurs at the beginning of the interaction, is 20 µm. The amplitude of
the impulse then decays exponentially in 100 ms. As the audio signal, a 400 Hz decaying
sinusoid, also lasting 100 ms, was selected. The initial (and maximum) sound pressure
level could be set at 50, 60 or 70 dB. Again, an exponential decay was applied.
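
The stimuli described above can be approximated with a short sketch. The text gives the peak values and the 100 ms duration but not the decay constant, the sample rate or how the sound pressure level maps to the waveform amplitude; here the envelope is assumed to fall by 60 dB over the stimulus duration, a 48 kHz sample rate is assumed, and the level is assumed to set the initial amplitude. Only the envelope of the tactile displacement is generated, since the exact impulsive waveform is not specified.

```python
import math

P_REF = 20e-6  # reference sound pressure for dB SPL, in pascal

def spl_to_pressure(level_db):
    """Convert a sound pressure level in dB SPL to a pressure in pascal."""
    return P_REF * 10.0 ** (level_db / 20.0)

def exp_envelope(duration_s, sample_rate, peak, decay_db=60.0):
    """Exponential envelope starting at `peak` and falling by `decay_db`
    over `duration_s` (the 60 dB default is an assumption)."""
    n = int(round(duration_s * sample_rate))
    tau = duration_s / (decay_db / 20.0 * math.log(10.0))
    return [peak * math.exp(-(i / sample_rate) / tau) for i in range(n)]

def audio_stimulus(level_db, sample_rate=48000, freq_hz=400.0, duration_s=0.1):
    """400 Hz decaying sinusoid whose initial amplitude is set by level_db."""
    env = exp_envelope(duration_s, sample_rate, spl_to_pressure(level_db))
    return [a * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i, a in enumerate(env)]

def tactile_envelope(sample_rate=48000, peak_m=20e-6, duration_s=0.1):
    """Displacement envelope of the tactile impulse, peaking at 20 micrometres."""
    return exp_envelope(duration_s, sample_rate, peak_m)
```

Calling `audio_stimulus(50)`, `audio_stimulus(60)` and `audio_stimulus(70)` would give the three audio conditions, all paired with the same `tactile_envelope()`.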


12.4.2 Set-up 


The experiment made use of the same hardware set-up as in experiment 1 (see Sect. 
12.3.2). In this case, the surface of the touchscreen was divided into two virtual 
buttons. 


12.4.3 Subjects 


Eighteen subjects, twelve male and six female, aged between 20 and 35 years,
participated in this experiment. The subjects had no prior knowledge of acoustics, and
they participated voluntarily in this study. All subjects were right-handed and had
self-reported normal hearing.


12.4.4 Procedure 


The task was to estimate the intensity of the feedback delivered by the virtual button. 
Participants were instructed to concentrate only on the tactile feedback. The magni- 
tude estimation method with anchor stimulus was used [40]. After the tactile-only 
anchor stimulus, a test stimulus was presented and participants had to assign a num- 
ber proportional to their subjective impression of the stimulus intensity relative to 
the anchor stimulus, assuming that the intensity of the latter corresponded to 100.
When participants did not perceive the test stimulus, they had to assign 0. Each
stimulus pair was presented ten times in random order.


12.4.5 Results and Discussion 


Figure 12.7 shows the responses of all subjects. Geometric mean values were com- 
puted for the magnitude estimates obtained from all subjects for each stimulus con- 
dition. 
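
As an illustration of this analysis step, the geometric mean of a set of magnitude estimates can be computed as below. This is a sketch, not the authors' analysis script. Since participants could answer 0 and the logarithm of 0 is undefined, the sketch assumes that zero responses are left out of the mean; the chapter does not say how such responses were treated.

```python
import math

def geometric_mean(estimates):
    """Geometric mean of magnitude estimates.

    Assumption: zero (not perceived) responses are excluded, because the
    geometric mean is only defined for positive values.
    """
    positive = [x for x in estimates if x > 0]
    if not positive:
        return 0.0
    return math.exp(sum(math.log(x) for x in positive) / len(positive))

def relative_increase_percent(condition_mean, baseline_mean):
    """Increase of one condition over a baseline, in percent."""
    return 100.0 * (condition_mean / baseline_mean - 1.0)
```

With the anchor defined as 100, a geometric mean of 156 in an audio-tactile condition against 100 in the tactile-only condition would correspond to a 56% increase.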

All audio-tactile conditions produced higher estimates than the tactile-only
condition. Dependent t tests of the means showed that three conditions (tactile only,
audio-tactile 50 dB and audio-tactile 70 dB) differed significantly (p < 0.05).

The results show that if a tactile button feedback is combined with audio feedback, 
the perceived intensity of the tactile feedback increases. When the tactile stimulus 
was accompanied by the acoustic stimulus, the tactile intensity was perceived on 
average between 56 and 96% higher. 

The perceived tactile intensity magnitude increased for increasing sound levels, in 
spite of no change in the actual tactile feedback level. Similarly, in a previous inves- 
tigation the authors found that, for a virtual drum, the magnitude of force feedback 
strength increased with increasing loudness, in spite of no change in force feedback 
[19]. 

Overall, these results indicate that auditory information can be useful in overcom- 
ing the current limitations of haptic devices. 


12.5 Conclusions 


In this chapter, the fundamental aspects of auditory and tactile perception were
first discussed, focusing on musical touchscreen applications. Based on this
knowledge, various audio-tactile signal generation techniques were introduced and
evaluated.

In a first series of experiments, it was found that percussive instruments can be
identified to some degree if audio-driven tactile feedback is rendered. The detection 
rate was best when source characteristics and rhythmic features were maintained 
while translating from audio to tactile signals. A qualitative study showed that tactile 
feedback can improve the quality of touchscreen-based music interfaces and make 
them more attractive for the users. 

A second investigation based on the same set-up focused on the perceived tactile
feedback intensity of virtual buttons, showing that this can be significantly
influenced by parallel auditory feedback. This result may be used to compensate for the
limitations of current small actuator technology as found in consumer devices. The coupled
perception of sound and vibration is important for the implementation of innovative 
touch-based musical interaction, and tactile feedback is useful to enrich the musical 
interaction.


Chapter 13
Implementation and Characterization
of Vibrotactile Interfaces


Stefano Papetti, Martin Fröhlich, Federico Fontana, 
Sébastien Schiesser and Federico Avanzini 


Abstract While a standard approach is more or less established for rendering basic 
vibratory cues in consumer electronics, the implementation of advanced vibrotac- 
tile feedback still requires designers and engineers to solve a number of technical 
issues. Several off-the-shelf vibration actuators are currently available, having dif- 
ferent characteristics and limitations that should be considered in the design process. 
We suggest an iterative approach to design in which vibrotactile interfaces are val- 
idated by testing their accuracy in rendering vibratory cues and in measuring input 
gestures. Several examples of prototype interfaces yielding audio-haptic feedback 
are described, ranging from open-ended devices to musical interfaces, addressing 
their design and the characterization of their vibratory output. 


S. Papetti (✉) · M. Fröhlich · S. Schiesser
ICST, Institute for Computer Music and Sound Technology,
Zürcher Hochschule der Künste, Pfingsweidstrasse 96, 8005 Zurich, Switzerland
e-mail: stefano.papetti@zhdk.ch

M. Fröhlich
e-mail: martin.froehlich@zhdk.ch

S. Schiesser
e-mail: sebastien.schiesser@zhdk.ch

F. Fontana
Dipartimento di Scienze Matematiche, Informatiche e Fisiche,
Università di Udine, via delle Scienze 206, 33100 Udine, Italy
e-mail: federico.fontana@uniud.it

F. Avanzini
Dipartimento di Informatica, Università di Milano,
Via Comelico 39, 20135 Milano, Italy
e-mail: federico.avanzini@di.unimi.it

© The Author(s) 2018
S. Papetti and C. Saitis (eds.), Musical Haptics, Springer Series on Touch
and Haptic Systems, https://doi.org/10.1007/978-3-319-58316-7_13


13.1 Introduction


The use of cutaneous feedback, in place of a full-featured haptic experience, has
recently received increased attention in the haptics community [5, 31], both at
research level and industrial level. Indeed, enabling vibration in consumer
devices, especially portable ones, is far more practical than providing motion and
force feedback to the user, which would generally result in bulky and mechanically
complex implementations requiring powerful motors. Recently, several studies have
been conducted on the use of vibratory cues as a sensory substitution method to
convey pseudo-haptic effects, e.g., to simulate textures [2, 26], moving objects [43],
forces [14, 25, 29, 35], or alter the perceived nature and compliance of materials [30,
32, 41]. Other studies exist that assessed intuitiveness of vibrotactile feedback with
untrained subjects [21] and how it may improve user performance after training [38].

Among the approaches adopted to design vibrotactile feedback for non-visual 
information display, complex semantics have been investigated [20] on top of simpler 
vibrotactile codes [3, 22]. Focusing in particular on DMIs, the most straightforward 
solution is to obtain tactile signals directly from their audio output. In practice, this 
may be done either by rendering to the skin the vibratory by-products generated by 
embedded loudspeakers—for instance, this may occur as a side effect while play- 
ing some inexpensive digital pianos for home practicing—or, using a slightly more 
sophisticated technique, by feeding dedicated vibrotactile actuators with the same 
signals used for auditory feedback [12]. In spite of the minimal design effort, these 
approaches have the potential to result in a credible multimodal experience. Sound 
and vibration are in fact tightly coupled phenomena, as sound is the acoustic mani- 
festation of a vibratory process. However, these simple solutions overlook a number 
of spurious and unwanted issues such as odd coupling between the electroacoustic 
equipment and the rest of the instrument, and unpredictable nonlinearities in the 
vibrotactile response of the setup [10]. A more careful design should be adopted 
instead, in which vibrotactile signals are tailored to match human vibrotactile
sensitivity (see Sect. 4.2) and adapted to the chosen actuator technology. In musical
interfaces, this can be generally done by equalizing the original audio signal with 
respect to both its overall energy and frequency content, as discussed in more detail 
in Sect. 13.3 of this chapter. 

To make sure that newly developed musical haptic devices actually render feed- 
back as designed, we suggest that they should undergo characterization and validation 
procedures. The literature of touch psychophysics shows that divergent results are 
possible, due to the varying accuracy of haptic devices [23, 36]. As an example, when 
studying vibrotactile sensitivity the characterization of vibratory output would allow 
experimenters to compare the stimuli actually delivered to the skin with the original 
stimuli fed in the experimental device. Notably, a similar practice is routinely imple- 
mented in psychoacoustic studies where, e.g., the actual sound intensity reaching the 
participants’ ears is usually measured and reported together with other experimental 
data. Particular attention should also be devoted to analyzing the mechanical coupling 
between a vibrotactile interface and the skin, as that is ultimately how vibratory stim- 
uli are conveyed [27]. However, as discussed in Sect. 4.1, this may turn out especially 
difficult when targeting everyday interaction involving active touch, as opposed to 
controlled passive settings that are only possible in a laboratory. Once character- 
istics have been measured, they may guide the iterative design and refinement of 
haptic interfaces and may offer experimenters a more insightful interpretation of 
experimental results.

In what follows, we first discuss readily available technology that is suitable for
implementing vibrotactile feedback in musical interfaces and then describe the design 
and characterization of a few exemplary devices that were recently developed by the 
authors for various purposes. 


13.2 Vibrotactile Actuators’ Technology 


When selecting vibrotactile actuators, designers and engineers need to consider fac- 
tors such as cost, size, shape, power and driving requirements, frequency, temporal, 
and amplitude response [5]. For rendering effective tactile feedback, such responses 
should at least be compatible with results of touch psychophysics. Also, to grant ver- 
satility in the design of vibrotactile cues, actuators’ frequency response and dynamic 
range should be as wide as possible, and their onset/stop time negligible. For exam- 
ple, while it is known that piano mechanics results in variable delay between action 
and audio-tactile feedback [1], to have full control over this aspect while designing 
keyboard-based DMIs, audio and tactile devices should offer the lowest possible 
latency [7, 17]. 

Among the currently available types of actuators suitable for conveying vibrotactile
stimuli, the most common are eccentric rotating mass (ERM) actuators, voice coil
actuators (VCAs), and piezoelectric actuators [5, 24].

ERM actuators make use of a direct current (DC) motor, which spins an eccentric 
rotating mass. They come in various designs with different form factors, ranging from 
cylinders to flat ‘pancakes.’ This technology has two main downsides: the first one
is that vibration frequency and amplitude are interdependent, as the rotational speed
(frequency), which is proportional to the applied voltage, also determines the
generated vibration amplitude; the second one is that, mainly due to its inertia, the
rotating mass requires some time to reach a target speed. Overall, these issues make
ERM unsuitable to reproduce audio-like signals that have rich frequency content and 
fast transients. Despite these limitations, thanks to their simple implementation ERM 
actuators have been commonly used in consumer electronics such as mobile phones 
and game devices. 
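
The coupling between frequency and amplitude in an ERM can be illustrated with an idealized model, which is an assumption of this sketch and not a description of any specific actuator: a point mass rotating at a fixed distance from the shaft produces a centripetal force whose amplitude is the mass times the eccentricity times the square of the angular speed. Friction, motor dynamics and the mounting are ignored, and the parameter names are hypothetical.

```python
import math

def erm_frequency_hz(rpm):
    """Vibration frequency equals the rotation rate."""
    return rpm / 60.0

def erm_force_amplitude_n(eccentric_mass_kg, eccentricity_m, rpm):
    """Idealized centripetal force amplitude F = m * r * omega**2."""
    omega = 2.0 * math.pi * erm_frequency_hz(rpm)
    return eccentric_mass_kg * eccentricity_m * omega ** 2
```

In this model, changing the drive voltage (and hence the rotation rate) changes both quantities at once, so a chosen frequency cannot be rendered at an arbitrary amplitude.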

VCAs are driven by alternating current (AC) and consist of an electrically
conductive coil (usually made of copper) interacting with a permanent magnet. Two
main VCA types are available, either using a moving coil or using a moving, sus- 
pended magnet. The functioning principle of moving coil VCAs is similar to that 
of the loudspeaker, except that, instead of a membrane producing sound pressure 
waves, there is a moving mass generating vibrations. Moving coil VCAs are gen- 
erally designed to move small masses, and since their output energy in the lower 
frequency range is constrained by the size of the moving mass, they cannot produce 
substantial low-frequency vibration. Conversely, moving magnet VCAs are of greater 
interest for vibrotactile applications as they can generally provide higher energy in 
the lower frequency band. However, to keep them compact and light, a smaller
moving mass must be compensated by a larger peak-to-peak excursion, complicating the
suspension design [44]. Linear resonating actuators (LRAs) are particular voice coil
designs that use a moving magnetic mass attached to a spring. They are meant to
produce fixed frequency vibration at the resonating frequency of the spring-mass
system, and therefore, they are highly power-efficient. Because of their increased
power efficiency and compactness compared to ERM actuators, LRAs are becoming 
the preferred choice for use in consumer electronics, at the cost of higher complexity 
of the driving circuit. Generally though, VCAs offer wide band frequency operation 
and quick response times, making them suitable for audio-like input signals, with 
complex frequency content and fast transients. 
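
For an ideal, undamped spring-mass system, the resonance frequency at which an LRA is meant to operate follows from the spring stiffness and the moving mass. The sketch below uses this textbook relation; the parameter names and any values passed to it are illustrative assumptions, not data for a particular device.

```python
import math

def resonance_frequency_hz(stiffness_n_per_m, moving_mass_kg):
    """Natural frequency of an ideal spring-mass system: sqrt(k / m) / (2 * pi)."""
    return math.sqrt(stiffness_n_per_m / moving_mass_kg) / (2.0 * math.pi)
```

A heavier moving mass or a softer spring lowers the resonance frequency, which is one reason why compact designs tend to resonate at higher frequencies.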

Piezoelectric materials deform proportionally to an applied electric field, or con- 
versely develop an electric charge proportional to the applied mechanical stress. For 
this reason, they can be used both as sensors and actuators. In the latter case, they 
may be driven by either direct or alternating current. Since piezoelectric actuators have
no moving parts and no friction is produced, they present minimal aging effects and 
are generally regarded as highly robust. Variations of size, form, and cost/quality 
factors are available, ranging from ultra-cheap thin piezo disks to high-performance 
devices made of stacked piezoelectric elements (e.g., used for precision positioning). 
Piezo actuators have extremely fast response times, and their frequency range can be 
very wide (although not particularly in the lower band), so they may be used, e.g., 
as extremely compact loudspeakers or to generate ultrasounds. Since they do not 
generate magnetic fields while operating, they are suitable when space is tight and 
insulation from other electronic components is not possible. On the downside, while
their current consumption is low (similar to LRAs), compared to VCAs and ERM
actuators they require a higher input voltage to operate, up to a few hundred volts.
Therefore, they usually need special driving electronics to be used with audio signals.

Several solutions are available for controlling the above types of actuators, both in 
the form of hardware and software. Hardware solutions are typically driving circuits 
used to condition input signals to conform with target actuator specifications,¹ while
software solutions include libraries of pre-recorded optimized input signals to achieve
different effects in interactive applications.²


13.3 Interface Examples 


13.3.1 The Touch-Box 


The Touch-Box is an interface originally developed for conducting experiments on
human performance and psychophysics under vibrotactile feedback conditions. The
device, shown in Fig. 13.1, measures normal forces applied to its top panel, which
provides vibrotactile feedback. An early prototype was used to study how auditory,
tactile, and audio-tactile feedback affect the accuracy of finger pressing force [18]. A
more recent psychophysical experiment, described in Sect. 4.2 and making use of a
more advanced prototype described below, investigated how vibrotactile sensitivity
is influenced by actively applied finger pressing forces of various intensities.

¹ See, for instance, www.ti.com/haptics (last accessed on Nov 29, 2017).
² For example, see Immersion TouchSense technology: www.immersion.com (last accessed on
Nov 29, 2017).

Fig. 13.1 The Touch-Box interface. Figure reprinted from [33]


13.3.1.1 Implementation 


For the latter experiment, a high-fidelity version of the Touch-Box was developed.
Load cell technology was selected for force sensing, thanks to its superior reliability
and reproducibility of results: a CZL635 load cell was chosen, capable of measuring
forces up to 49 N. For vibrotactile feedback, a Tactile Labs Haptuator mark II³ was
used: a moving-magnet VCA suitable for rendering vibration up to 1000 Hz. An
Arduino UNO computing platform⁴ receives the analog force signal from the load
cell and samples it uniformly at 1920 Hz with 10-bit resolution [6]. The board is
connected via USB to ad hoc software developed in the Pure Data environment and
run on a host computer. The software receives force data and uses them to synthesize
vibrotactile signals in return. These are routed as audio signals through an RME
Fireface 800 audio interface⁵ feeding an audio amplifier connected to the actuator.
The device also measures the area of contact of a finger touching its top surface. Similar
to the technological solution described in [42], a strip of infrared LEDs was attached
at one side of the top panel, which is made of transparent Plexiglas: in this way, a
finger pad touching the surface is illuminated by the infrared light passing through
it. A miniature infrared camera placed under the top panel captures high-resolution
(1280 × 960 pixels) images at 30 fps and sends them via USB to video processing
software developed in the Max/MSP/Jitter environment, where finger contact area is
estimated.

³ http://tactilelabs.com/products/haptics/haptuator-mark-ii-v2/ (last accessed on Dec. 21, 2017).
⁴ https://store.arduino.cc/usa/arduino-uno-rev3 (last accessed on Dec. 21, 2017).
⁵ http://www.rme-audio.de/en/products/fireface_800.php (last accessed on Dec. 21, 2017).

The mechanical construction of the interface was iteratively refined, so as to opti- 
mize the response of the force sensor and vibrotactile actuator. For instance, since the 
moving magnet of the Haptuator moves along its longitudinal direction, the actuator 
was suspended and mounted perpendicularly at the lower side of the Touch-Box top 
panel, thus maximizing the amount of energy conveyed to it. Special care was devoted
to preventing coupling of the Haptuator with the rest of the structure, which could
generate spurious resonances and dissipate energy. Various weight and thickness values of
the Plexiglas panel were also tested, with the purpose of minimizing nonlinearities 
in the produced vibration, while keeping the equivalent mass of a finger pressing on 
top of the panel compatible with the vibratory power generated by our system. 


13.3.1.2 Characterization of Force Measurement 


The offset load on the force sensor due to the device construction was first measured 
and subtracted for subsequent processing. Force acquisition was characterized by 
performing measurements with a set of test weights from 50 to 5000 g resulting in a 
pseudo-linear curve which maps digital data readings from the Arduino board (10-bit 
values) to the corresponding force values in Newtons. The obtained map was used 
in the Pure Data software to read force data. 
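
A calibration map of this kind can be sketched as a piecewise-linear interpolation between the readings obtained with the test weights. The sketch below is not the authors' Pure Data patch: the function names are hypothetical, the calibration pairs passed to it would be measured values (none are given in the text), and readings above the heaviest calibration point are simply clamped.

```python
G_STANDARD = 9.80665  # standard gravity in m/s^2, converts kilograms to newtons

def make_force_map(calibration, offset_reading=0):
    """Build a function mapping raw 10-bit readings to force in newtons.

    calibration: list of (raw_reading, test_mass_in_grams) pairs, assumed
    to have distinct raw readings. offset_reading is the raw value
    measured with no load, subtracted before mapping.
    """
    points = [(0.0, 0.0)] + sorted(
        (raw - offset_reading, mass_g / 1000.0 * G_STANDARD)
        for raw, mass_g in calibration
    )

    def to_newton(raw):
        x = raw - offset_reading
        if x <= 0:
            return 0.0
        # Linear interpolation within the segment that contains x.
        for (x0, f0), (x1, f1) in zip(points, points[1:]):
            if x <= x1:
                return f0 + (f1 - f0) * (x - x0) / (x1 - x0)
        return points[-1][1]  # clamp above the heaviest test weight

    return to_newton
```

Note that the heaviest test weight, 5000 g, corresponds to about 49 N, which matches the stated range of the load cell.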


13.3.1.3 Characterization of Contact Area Measurement 


Finger contact area is obtained from the data recorded by the infrared camera. 
Acquired images are processed in real time to extract the contour of the finger pad 
portion in contact with the panel and to count the number of contained pixels. 

The area corresponding to a single pixel (i.e., the resolution of the area mea- 
surement system) was calibrated by applying a set of laser-cut adhesive patches of 
predefined sizes on the top panel. Test weights of 200, 800, and 1500 g were used 
to simulate the pressing forces used in the experiment described in Sect. 4.2, which 
result in slightly different distances of the top panel from the camera, influencing its 
magnification ratio. The measurements were averaged for each pressing force level, 
obtaining the following pixel size values: 0.001161 mm² (200 g), 0.001125 mm²
(800 g), and 0.001098 mm² (1500 g).

Finger contact areas in mm² were finally obtained by multiplying the counted
number of pixels by the appropriate pixel size value, depending on the applied force.
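
The conversion from pixel count to contact area can be sketched as a simple lookup. The pixel sizes are the calibrated values reported above; choosing the calibration level nearest to the applied load is an assumption of this sketch, as the text only states that the value depends on the applied force.

```python
# Calibrated area of one pixel in mm^2, keyed by test weight in grams.
PIXEL_AREA_MM2 = {200: 0.001161, 800: 0.001125, 1500: 0.001098}

def contact_area_mm2(pixel_count, load_g):
    """Contact area from a pixel count, using the nearest calibrated load."""
    nearest = min(PIXEL_AREA_MM2, key=lambda w: abs(w - load_g))
    return pixel_count * PIXEL_AREA_MM2[nearest]
```

For example, 100000 pixels counted under a 200 g load correspond to roughly 116 mm².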


13.3.1.4 Characterization of Vibration Output 


The accuracy of the device in reproducing a given vibrotactile signal was tested. The 
test signals were those used in the mentioned experiment: a sine wave at 250 Hz, and
white noise band-pass filtered with 48 dB/octave cutoffs at 50 and 500 Hz. Vibration
measurements were carried out with a Wilcoxon 736T piezoelectric accelerometer⁶
(sensitivity 10.2 mV/(m/s²) ±5% at 25 °C, with frequency response flat within ±5% in
the 5 to 32200 Hz range) connected to a Wilcoxon iT111M transmitter.⁷ The accelerometer
was secured to the top of the Touch-Box with double-sided adhesive tape. The AC-coupled
output of the transmitter was recorded via an RME Fireface 800 interface as audio
signals at 48 kHz with 24-bit resolution.

Vibrations produced by the Touch-Box were recorded at different amplitudes 
in 2 dB steps, in the range used in the reference experiment. Measurements were 
repeated by placing 200, 800 and 1500 g test weights on top of the device, accounting 
for the pressing forces used in the experiment. 

The following calculations were performed on the recorded vibration signals to
extract acceleration values: (i) digital values in the range [−1, 1] were translated to
a dBFS representation; (ii) voltage values in volts were obtained from the dBFS values,
based on the nominal input sensitivity of the audio interface (+19 dBu @ 0 dBFS,
reference 0.775 V); (iii) acceleration values in m/s² were calculated from the voltage
values, based on the nominal sensitivity of the accelerometer. Finally, RMS acceleration
values in dB (re 10⁻⁶ m/s²) were computed over an observation interval of 8 s
to minimize the contribution of unwanted external noise. Notice that the considered
vibration signals are periodic or stationary.
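
The conversion chain above can be sketched in a few lines. The constants are the nominal values quoted in the text. Two points are assumptions of this sketch: a digital RMS value of 1.0 is taken to correspond to 0 dBFS, and the transmitter is taken to have unity gain.

```python
import math

DBU_REF_V = 0.775        # 0 dBu reference voltage, in volts
FULL_SCALE_DBU = 19.0    # nominal interface input level at 0 dBFS
SENSITIVITY = 10.2e-3    # accelerometer sensitivity, in V per m/s^2
ACC_REF = 1e-6           # reference acceleration for dB values, in m/s^2

def rms(samples):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def rms_acceleration_db(samples):
    """RMS acceleration in dB re 1e-6 m/s^2 from samples in [-1, 1]."""
    level = rms(samples)
    if level == 0.0:
        return float("-inf")
    dbfs = 20.0 * math.log10(level)                              # step (i)
    volts = DBU_REF_V * 10.0 ** ((dbfs + FULL_SCALE_DBU) / 20.0)  # step (ii)
    acceleration = volts / SENSITIVITY                           # step (iii)
    return 20.0 * math.log10(acceleration / ACC_REF)
```

Since every step is a fixed gain, a 2 dB change in the recorded signal level maps to a 2 dB change in the computed acceleration level.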


Amplitude Response 


The curves in Fig. 13.2a, b relate the relative amplitudes of the stimuli to the
corresponding actual vibration energy produced by the Touch-Box, expressed as RMS
acceleration. Vibration acceleration was measured in the range from the initial
amplitude used in the reference experiment down to 6 dB below the minimum average
vibrotactile threshold found. Generally, vibration amplitude varied consistently with
that of the input signal, resulting in a pseudo-linear relationship. However, the three
weights resulted in different amplitude offsets, due to mechanical dampening. In
the analysis of experimental data, this characterization was used for mapping the
experimental results to actual RMS vibration acceleration values, in this way
compensating for the dampening effect of pressing forces on vibration amplitude. As
shown in Table 13.1a, the effective step size of amplitude variation for the three
weights is consistent across the considered range.


⁶ https://buy.wilcoxon.com/736t.html (last accessed on Dec. 21, 2017).
⁷ https://buy.wilcoxon.com/it100-200m.html (last accessed on Dec. 21, 2017).


Table 13.1 Mean and standard deviation (in brackets) of (a) RMS acceleration amplitude variation
(original step size 2 dB), and (b) offsets relative to amplitudes measured for the 200 g weight. Table
reprinted from [33] (Appendix)

Weight (g)    Sinusoidal vibration (dB)    Noise vibration (dB)
(a)
200           1.98 (0.06)                  1.79 (0.33)
800           1.99 (0.11)                  2.01 (0.32)
1500          1.95 (0.13)                  1.95 (0.19)
(b)
800           -8.76 (0.09)                 -8.61 (1.13)
1500          -10.65 (0.21)                -6.95 (0.65)

Table 13.1b shows amplitude offsets for the 800 and 1500 g weights, relative to
the measured amplitudes for the 200 g weight. Overall, the performed characterization
shows that the device behaves consistently with regard to amplitude and energy
response, with slightly higher accuracy when sinusoidal vibration is used.


Frequency Response 


Figure 13.3 shows the measured magnitude spectra of noise stimuli, for three sample
amplitudes ranging from the initial level used in the experiment down to 6 dB
below the minimum average threshold found. In addition to the dampening effect
on RMS vibration amplitudes noted above (which is the only effect measured
in the sinusoidal condition), in the case of the noise stimulus, the three weights
resulted in spectral structures slightly different from the original flat spectrum in the
50-500 Hz range used as input signal. For a given weight, the spectral centroid (i.e.,
the amplitude-weighted average frequency, which roughly represents the ‘center of
mass’ of a spectrum) of noise vibration was found to generally decrease with the
signal amplitude: for the 200 g weight, the spectral centroid varied from 188 Hz at the
initial amplitude to 173 Hz at 6 dB below the minimum average threshold found.
For the 800 and 1500 g weights, the spectral centroid varied, respectively, from 381.3
to 303 Hz and from 374.5 to 359.4 Hz.
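
The spectral centroid defined above can be computed as in the following sketch. A naive discrete Fourier transform is used only to keep the example self-contained; it is slow and is not the 32768-point FFT used for the measurements, and the function names are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(samples, sample_rate):
    """Naive DFT: returns (frequencies in Hz, magnitudes) up to Nyquist."""
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        freqs.append(k * sample_rate / n)
        mags.append(abs(acc))
    return freqs, mags

def spectral_centroid(freqs, mags):
    """Amplitude-weighted average frequency of a magnitude spectrum."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, mags)) / total
```

For a pure 250 Hz sine the centroid sits at 250 Hz, while for band-limited noise it moves towards whichever part of the band carries more energy.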

The characterization of vibrotactile feedback highlighted strengths and weaknesses
of the Touch-Box implementation, making it possible to validate experimental results
and to compensate for hardware limitations (namely, amplitude dampening and
non-flat spectral response). For instance, as mentioned in Sect. 4.2.4, finding that the peak
energy of the stimuli in the higher force condition shifted above the region of
maximum sensitivity (200-300 Hz, [39]) suggests that the vibrotactile threshold measured
in that case was likely higher than in reality.


13.3.2 The VibroPiano 


Historically, the reproduction of the haptic properties of the piano keyboard was
first approached from a kinematic perspective, with the aim of recreating the
mechanical response of the keys [4, 28], also in light of experiments emphasizing
the sensitivity of pianists to the keyboard mechanics [13]. Only recently, and in
parallel with industrial outcomes [16], have researchers started to analyze the role
of the vibrotactile feedback component as a potential conveyor of salient cues. An
early attempt by some of the present authors suggested that these cues may be
qualitatively relevant while playing a digital piano [12]. A few years later, a
refined digital piano prototype was implemented, capable of reproducing various
types of vibrotactile feedback at the keyboard. This new prototype was used to test
whether the nature of feedback can affect pianists’ performance and their perception
of quality features (see Sect. 5.3.2.2).

Fig. 13.3 Acceleration magnitude spectrum (FFT size 32768) of the noise stimuli for
the three test weights. Colors represent different amplitudes, from the start
amplitude down to below the minimum vibrotactile threshold found in the experiment


13.3.2.1 Implementation


A digital piano was used as a platform for the development of a keyboard prototype
yielding vibrotactile feedback. After some preliminary testing with different
tactile actuators attached to the bottom of the original keyboard, the instrument was
disassembled, and the keyboard was detached from its metal casing and screwed to a
thick plywood board (see Fig. 13.4). This customization improved the reproduction
of vibrations at the keys: on the one hand by avoiding hard-to-control nonlinearities
arising from the metal casing, and on the other hand by conveying higher
vibratory energy to the keys thanks to the stiffer wooden board. Two Clark Synthesis
TST239 tactile transducers⁸ were attached to the bottom of the wooden board,
placed beneath the lower and middle octaves, respectively, in this way
conveying vibrations at the most relevant areas of the keyboard [11]. Once equipped
in this way, the keyboard was laid on a stand, interposing foam rubber at the contact
points to minimize the formation of additional resonances.

Fig. 13.4 The VibroPiano setup. Figure adapted from [10]

⁸ http://clarksynthesis.com/ (last accessed on Dec. 21, 2017).

The transducers were driven by a high-power stereo audio amplifier set to dual
mono configuration and fed with a monophonic signal sent by a host computer via
an RME Fireface 800 audio interface. The audio interface received MIDI data from
the keyboard and passed it to the computer, where sound and vibrotactile feedback
were generated, respectively, by Modartt Pianoteq,⁹ a physical modeling piano whose
audio feedback was delivered to the performer via earphones, and by a software
sampler playing back vibration samples, which were prepared beforehand as described
below. A diagram of the setup is shown in Fig. 13.5.


13.3.2.2 Preparation of Vibration Samples


Recording of Piano Keyboard Vibrations


Vibrations were recorded at the keyboard of two Yamaha Disklavier pianos (a grand
model DC3-M4, and an upright model DU1A with control unit DKC-850) via the
same measurement setup described in Sect. 13.3.1.4. The accelerometer was secured
to each measured key with double-sided tape to ensure stable coupling and easy
removal. As explained in Sect. 4.3.1, Disklavier pianos can be controlled remotely by
sending them MIDI control data. This made it possible to automate the recording of
vibration samples by playing back MIDI ‘note ON’ messages at various MIDI
velocities for each of the 88 actuated keys of the Disklaviers.


⁹ https://www.pianoteq.com/ (last accessed on Dec. 21, 2017).

Fig. 13.5 Schematic of the VibroPiano setup. Figure reprinted from [10]


The choice of suitable MIDI velocities required an analysis of the Disklaviers’
dynamic range. The MIDI volume of the two Disklavier pianos was first set to
approximate a linear response to MIDI velocity, according to Yamaha’s recommendations.
The acoustic dynamic response to MIDI velocity was then measured by means of a
KEMAR mannequin¹⁰ (grand Disklavier) or a sound level meter (upright Disklavier)
placed above the stool, approximately at the height of a pianist’s ears [11]. The
loudness of an A4 tone was measured for ten evenly spaced values of MIDI velocity in
the range 2-127. Each measurement was repeated several times and averaged. Results
are reported in Table 13.2. In accordance with a previous study [15] that measured
the temporal and dynamic accuracy of computer-controlled grand pianos in reproducing
MIDI control data, our results show a flattened dynamic response for high velocity
values. Also, the upright model shows a narrower dynamic range, especially for low
velocity values.


¹⁰ http://kemar.us/ (last accessed on Dec. 21, 2017).

Table 13.2 Sound level of an A4 tone, generated by the two Disklavier pianos for various MIDI
velocities

MIDI velocity    Grand Disklavier (DC3-M4) (dB)    Upright Disklavier (DU1A) (dB)
2                47.8                              73.3
16               51.8                              73.9
30               60.0                              74.6
44               66.3                              79.8
58               72.4                              84.5
71               76.7                              87.6
85               80.1                              90.7
99               83.0                              90.6
113              85.1                              91.6
127              85.5                              91.2


Based on the above results, MIDI velocities 12, 23, 34, 45, 56, 67, 78, 89, 100, and
111 were selected for acquiring vibration recordings. This covered substantially the
entire dynamic range of the pianos with evenly spaced velocity values. Extreme
velocity values were excluded, as they result in flattened dynamics or unreliable
response. For each of the selected velocity values, acceleration samples were
recorded at the 88 keys of the two pianos. Recordings for each key/velocity
combination lasted 16 s, thus amply describing the decay of vibration amplitude.
Since the accelerometer was mounted on top of the measured keys, the initial part of
the recorded samples represents the displacement of the keys being depressed by the
actuation mechanism, until they hit the keybed and stop (see Fig. 4.4). Since
kinesthetic components were not of interest for our research, these transients were
manually removed from each of the samples, leaving only the purely vibratory part.
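The size of the recording task can be sketched as follows. The snippet builds raw MIDI ‘note ON’ messages for every key/velocity combination; the MIDI channel, the playback order, and all names are assumptions made for illustration, and the actual transmission to the Disklavier is left out.

```python
NOTE_ON_STATUS = 0x90  # MIDI 'note ON' on channel 1 (channel is an assumption)

# Velocities selected in the text; 88 keys span MIDI notes 21 (A0) to 108 (C8).
VELOCITIES = [12, 23, 34, 45, 56, 67, 78, 89, 100, 111]
KEYS = range(21, 109)
RECORDING_SECONDS = 16

def note_on(note: int, velocity: int) -> bytes:
    """Build a raw 3-byte MIDI 'note ON' message."""
    if not (0 <= note <= 127 and 1 <= velocity <= 127):
        raise ValueError("note must be 0-127 and velocity 1-127")
    return bytes([NOTE_ON_STATUS, note, velocity])

# One message per key/velocity combination, key by key (order is an assumption).
schedule = [note_on(n, v) for n in KEYS for v in VELOCITIES]

total_hours = len(schedule) * RECORDING_SECONDS / 3600
print(len(schedule), "recordings per piano")
print(f"about {total_hours:.1f} h of recording time per piano")
```

With 880 recordings of 16 s each, a single pass over one piano takes close to four hours, which illustrates why remote MIDI control was needed to automate the procedure.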


Synthetic Vibration Samples 


A further set of vibration samples was instead synthesized, aiming at reproducing
the same amplitude envelope of the real vibration signals while changing only their
spectral content. Synthetic signals for each key and each of the selected velocity
values were generated as follows. First, white noise was bandlimited in the range
20-500 Hz, covering the vibrotactile bandwidth [40] while being compatible with
audio equipment.¹¹ The bandlimited noise was then passed through a second-order
resonant filter centered at the fundamental frequency of the note corresponding to the
key. The resulting signal was modulated by the amplitude envelope of the matching
vibration sample recorded on the grand piano, which in turn was estimated from
the energy decay curve of the sample via the Schroeder integral [37]. Finally, the
power (RMS level) of the synthetic sample was equalized to that of the corresponding
recorded sample.

¹¹ In the low range, audio amplifiers are usually meant to treat signals down to 20 Hz.
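The last two steps of the synthesis (envelope modulation and RMS equalization) can be sketched as follows. The source does not state how the envelope was derived from the energy decay curve; the sketch assumes the square root of the backward-integrated energy, which is proportional to the amplitude envelope for an exponential decay. The band-limiting and resonant filtering steps are omitted, toy signals replace the real data, and all names are illustrative.

```python
import math

def schroeder_edc(signal):
    """Backward-integrated energy: edc[n] = sum of signal[k]**2 for k >= n."""
    edc = [0.0] * len(signal)
    running = 0.0
    for n in range(len(signal) - 1, -1, -1):
        running += signal[n] ** 2
        edc[n] = running
    return edc

def envelope_from_edc(edc):
    """Assumed envelope estimate: sqrt of the EDC, normalized to 1 at onset."""
    return [math.sqrt(e / edc[0]) for e in edc]

def rms(signal):
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def match_rms(synthetic, recorded):
    """Scale the synthetic sample so that its RMS level equals the recorded one."""
    gain = rms(recorded) / rms(synthetic)
    return [gain * x for x in synthetic]

# Toy data standing in for a recorded sample and a filtered-noise signal.
recorded = [0.8 * math.exp(-n / 50.0) * math.sin(0.3 * n) for n in range(400)]
synthetic = [math.sin(0.7 * n) for n in range(400)]

envelope = envelope_from_edc(schroeder_edc(recorded))
shaped = [e * x for e, x in zip(envelope, synthetic)]
equalized = match_rms(shaped, recorded)
```

Backward integration yields a smooth, monotonically decreasing curve, which is why it is a convenient basis for envelope estimation compared with rectifying and low-pass filtering the oscillating signal.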


Vibration Sample Libraries 


The recorded and synthetic vibration sample sets were stored in the software
sampler, which offers sample interpolation across MIDI velocities. Overall, three
sample libraries were created: two from recordings on the grand and upright
Disklavier pianos, and one from the generated synthetic samples.


13.3.2.3 Characterization and Calibration 


As suggested earlier in this chapter, to make sure that the piano prototype could
accurately reproduce the designed audio and tactile feedback, it was subjected to a
calibration procedure dealing with the following aspects: (i) auditory loudness;
(ii) keyboard velocity response; (iii) amplitude and frequency response of
vibrotactile feedback.


Loudness Matching 


As a first step, the loudness of the piano synthesizer at the performer’s ear was
matched to that of the Disklavier pianos. The piano synthesizer was set to simulate
either a grand or an upright piano, to match the character of the reference Disklaviers.
Measurements were taken with the KEMAR mannequin wearing earphones by having
Pianoteq play back A notes in all octaves at the previously selected velocities.
The volume mapping feature of Pianoteq, which allows the volume of each key to be
set independently across the keyboard, was then used to match the loudness of the
piano synthesizer to the measurements taken on the Disklavier pianos as described
in Sect. 13.3.2.2.


Keyboard Velocity Calibration 


As expected, the keyboards of the Disklaviers and that of the Galileo digital piano 
have markedly different response dynamics due to their different mechanics and 
mass. Once the loudness of the piano synthesizer was set, the velocity response of 
the digital piano keyboard was matched to that of the Disklavier pianos. 

The keyboard response was adjusted via the velocity calibration routine included
with Pianoteq, which was performed by an experienced pianist first on the Disklavier
pianos (this time used as silent MIDI controllers driving Pianoteq) and then on the
digital keyboard. Fairly different velocity maps were obtained. By making use of
a MIDI data filter, each point of the digital keyboard velocity map was projected
onto the corresponding point of the Disklavier velocity map. Two maps were therefore
created, one for each synthesizer-Disklavier pair (grand and upright models).
The resulting key velocity transfer characteristics were then independently checked
by two more pianists, to validate their reliability and neutrality. Such maps ensured
that, when a pianist played the digital keyboard at a desired dynamics, the generated
auditory and tactile feedback was consistent with that of the corresponding Disklavier
piano.
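One way to picture the projection of one velocity map onto the other is a piecewise-linear MIDI velocity filter, sketched below. This is not the filter actually used: the calibration points are hypothetical, the linear interpolation between points and the clamping at the ends are assumptions, and all names are illustrative.

```python
from bisect import bisect_right

def make_velocity_filter(digital_points, disklavier_points):
    """Return a function mapping digital-keyboard velocities to Disklavier ones.

    Both lists hold the velocities obtained for the same sequence of dynamic
    levels, in increasing order; values in between are linearly interpolated.
    """
    if len(digital_points) != len(disklavier_points):
        raise ValueError("both maps need the same number of points")

    def remap(velocity: int) -> int:
        # Outside the calibrated range, clamp to the end points (assumption).
        if velocity <= digital_points[0]:
            return disklavier_points[0]
        if velocity >= digital_points[-1]:
            return disklavier_points[-1]
        i = bisect_right(digital_points, velocity) - 1
        x0, x1 = digital_points[i], digital_points[i + 1]
        y0, y1 = disklavier_points[i], disklavier_points[i + 1]
        mapped = y0 + (velocity - x0) * (y1 - y0) / (x1 - x0)
        return max(1, min(127, round(mapped)))

    return remap

# Hypothetical calibration points (not measured data).
digital = [10, 40, 70, 100, 120]
disklavier = [20, 45, 64, 90, 110]
to_disklavier = make_velocity_filter(digital, disklavier)
print(to_disklavier(40), to_disklavier(85))  # 45 77
```

One such filter per synthesizer-Disklavier pair would correspond to the two maps mentioned above.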


Spectral Equalization 


As a final refinement, the vibratory frequency response of the setup was analyzed and 
then equalized for spectral flattening. Despite the optimized construction, spurious 
resonances were still present in the keyboard-plywood system, and additionally, the 
transducers’ frequency response exhibits a prominent notch around 300 Hz. 

The overall frequency response of the transduction-transmission chain was measured
at all the A keys, leading to an average magnitude spectrum that, once inverted,
provided the spectral flattening equalization characteristics shown in Fig. 13.6.
The 300 Hz notch of the transducers was thus compensated along with the
resonances and anti-resonances of the mechanical system.
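The idea of inverting an average magnitude spectrum can be sketched as follows. The averaging in the dB domain, the normalization to a mean gain of 0 dB, and the cap on the maximum boost are assumptions of this sketch, and the input values are hypothetical rather than measured.

```python
def flattening_curve_db(responses_db, max_boost_db=20.0):
    """Average per-key magnitude responses (dB) and invert them.

    responses_db holds one list per measured key, all on the same frequency
    grid. The returned gains (dB) would flatten the average response; they are
    referred to the mean level and capped at max_boost_db (both assumptions).
    """
    n_keys = len(responses_db)
    n_bins = len(responses_db[0])
    average = [sum(r[k] for r in responses_db) / n_keys for k in range(n_bins)]
    reference = sum(average) / n_bins
    return [min(reference - a, max_boost_db) for a in average]

# Hypothetical responses of two keys at 100, 200, 300, and 400 Hz,
# with a notch at 300 Hz.
measured = [
    [0.0, 2.0, -12.0, -2.0],
    [0.0, 4.0, -16.0, 0.0],
]
print(flattening_curve_db(measured))  # [-3.0, -6.0, 11.0, -2.0]
```

In this toy case, the notch at 300 Hz turns into the largest boost of the equalization curve, while the frequencies that respond more strongly than average are attenuated.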

In order to prevent the generation of resonance peaks along the keyboard, the 
equalization curve was approximated using a software parametric equalizer in series 
with the software sampler that reproduced vibration signals. Focusing on the tactile 
bandwidth range, the approximation made use of a shelving filter providing a ramp 
climbing by 18 dB in the range 100-600 Hz, and a 2nd-order filter block approxi- 
mating the peak around 180 Hz. 

At the present stage, the VibroPiano has undergone informal evaluation by several
pianists, who gave very positive feedback. Moreover, as described in Sect. 5.3.2.2, it
has been used to test how different types of vibrotactile feedback (namely, realistic,
realistic with increased intensity, synthetic, and no feedback) may influence the user
experience and the perception of quality features such as control of dynamics,
loudness, richness of tone, naturalness, engagement, and general preference.

Fig. 13.6 Spectral flattening: average equalization curve. Figure reprinted from [10]

Fig. 13.7 The HSoundplane


13.3.3 The HSoundplane 


The HSoundplane, shown in Fig. 13.7, is a multi-touch musical interface prototype 
offering multi-point, localized vibrotactile feedback. The main purpose of the inter- 
face is to provide an open and versatile framework allowing experimentation with 
different audio-tactile mappings, for testing the effectiveness of vibrotactile feedback 
in musical practice. 


13.3.3.1 Hardware Implementation 


Most current touchscreen technology still lacks finger pressure sensing¹² and often
does not offer satisfying response times for use in real-time musical performance. To
overcome these issues, our prototype was developed based on the Madrona Labs
Soundplane: an advanced musical controller, first described in [19] and now
commercially available.¹³ The interface allows easy disassembly and is potentially open
to hacking, which was required for our purpose. The Soundplane has a large
multi-touch and pressure-sensitive surface based on ultra-fast patented capacitive
sensing technology, offering tracking times in the order of a few ms, as opposed to
the lag above 50 ms of the current best touchscreen technology [8]. Its sensing layer
uses several carrier antennas, each transporting an audio-rate signal at a different
fixed frequency. Separated by a dielectric layer, transversal pickup antennas catch
these signals, which are modulated by changes of thickness in the dielectric layer due
to finger pressure on the Soundplane’s flexible surface. An internal DSP takes care of
generating the carrier signals and decoding the touch-modulated signals for multiple
fingers. The computed touch data (describing multi-finger positions and pressing
forces) are sent to a host computer via USB connection. The Soundplane’s sensing
technology requires the top surface and underlying layers to be as flat and uniform
as possible. A software calibration routine is provided to compensate for minor
irregularities.

¹² With the exception of the recent Force Touch technology by Apple.

¹³ www.madronalabs.com (last accessed on Nov 29, 2017).

In the remainder of this section, we describe how the original Soundplane was
augmented with vibrotactile feedback, resulting in the HSoundplane prototype (where
‘H’ stands for ‘haptic’).


Construction 


The original Soundplane’s multilayered design consists of a top tiled surface—a 
sandwich construction made of wood veneer stuck to a thin Plexiglas plate and a 
natural rubber foil—resting on top of the capacitive sensing layer described above. 
Since these components are simply laid upon each other and kept in place with 
pegs built into the wooden casing, it is quite simple to disassemble the structure and 
replace some of its elements. 

To implement a haptic layer for the Soundplane, we chose a solution based
on low-cost piezoelectric elements: in addition to the advantages pointed out in
Sect. 13.2, such devices are extremely thin (down to a few tenths of a millimeter)
and lend themselves to scaling up thanks to their small size and low price. The
proposed solution makes use of piezo actuator disks arranged in a 30 × 5 matrix
configuration matching the tiled pads on the Soundplane surface, so that each
actuator corresponds to a tile (see Fig. 13.8).

In order to maximize the vibration energy conveyed to the fingers, vibrotactile 
actuators should be ideally placed as close as possible to the touch location. The actu- 
ators layer was therefore placed between the top surface and the sensing components. 
However, such a solution poses some serious challenges: The original flexibility, flat- 
ness, and thickness of the layers above the sensing components have to be preserved 
as much as possible, so as to retain the sensitivity and calibration uniformity of the 
Soundplane’s sensor surface. To this end, the piezo elements were wired via an ad 
hoc designed flexible PCB foil with SMD soldering techniques and electrically con- 
ductive adhesive transfer tapes (3M 9703). The PCB with attached piezo elements 
was laid on top of an additional thin rubber sheet, with holes corresponding to each 
piezo element: This ensures enough free space to allow optimal mechanical deflec- 
tion of the actuators, and also improves the overall flexibility of the construction. 
However thin, the addition of the actuators layer alters the overall thickness of the
hardware. For this reason, we had to redesign the original top surface, replacing it
with a thinner version. As a result, the thickness of the new top surface plus the
actuators layer matches that of the original surface. Figure 13.9 shows an exploded
view of the HSoundplane construction, consisting of a total of nine layers.

Fig. 13.8 Schematic of the actuators’ control electronics: a piezo actuators on flexible PCBs
(simplified view); b slave PCBs with audio-to-haptic drivers and routing electronics; c master
controller. Notice: The 1st and 32nd channels are unused


Electronics 


Based on off-the-shelf components, custom amplifying and routing electronics were 
designed to drive piezo elements with standard audio signals. 

Fig. 13.9 Multilayered construction of the HSoundplane: a wooden case (new); b touch surface
(wood veneer, 0.5 mm, new); c Plexiglas plate (1 mm, new); d natural rubber sheet (1.3 mm, new);
e flexible PCB foil (0.3 mm, new); f piezo elements (0.2 mm, new); g natural rubber holed sheet
(1.3 mm, new); h carrier antennas (original); i dielectric (original); j pickup antennas
(original). Figure reprinted from [34]

In order to provide effective vibrotactile feedback at the HSoundplane’s surface,
some key considerations were made. Driving piezo actuators requires voltage values
(in our case up to 200 Vpp) that are not compatible with standard audio equipment.
This, together with the large number of actuators used in the HSoundplane (150),
poses a non-trivial electrical challenge. Since the signals are in the analog domain,
using a separate audio signal for each actuator would be overkill. Therefore, we
considered using a maximum of one channel per column of pads, reducing the
requirements to 30 separate audio channels. These are provided by a MADI system¹⁴
formed by an RME MADIface USB¹⁵ hooked to a D.O.TEC ANDIAMO 2¹⁶ AD/DA converter.
To comply with the electrical specifications of the piezo transducers, the analog
audio signals produced by the MADI system must be amplified by about a factor 50
using a balanced signal: the output sensitivity of the MADI system was set to 9 dBu
@ 0 dBFS (reference 0.775 V),¹⁷ resulting in a maximum voltage of 2.18 V. Routing
continuous analog signals is also a delicate issue, since the end user must not notice
any disturbance or delay in the feedback.

¹⁴ Multichannel Audio Digital Interface: https://www.en.wikipedia.org/wiki/MADI (last accessed
on Nov 29, 2017).

¹⁵ https://www.rme-audio.de/en/products/madiface_usb.php (last accessed on Dec. 21, 2017).

¹⁶ http://www.directout.eu/en/products/andiamo-2/ (last accessed on Dec. 21, 2017).
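The 2.18 V value quoted above follows from the definition of dBu. The short sketch below reproduces it, assuming that the value is an RMS voltage; the peak-to-peak figure additionally assumes a sinusoidal signal, and the names are illustrative.

```python
import math

DBU_REFERENCE_VRMS = 0.775  # 0 dBu reference voltage, as given in the text

def dbu_to_vrms(level_dbu: float) -> float:
    """Convert a level in dBu to an RMS voltage."""
    return DBU_REFERENCE_VRMS * 10.0 ** (level_dbu / 20.0)

# Output sensitivity reported in the text: 9 dBu at 0 dBFS.
full_scale_vrms = dbu_to_vrms(9.0)

# Peak-to-peak voltage of a full-scale sine wave (assumes a sinusoid).
full_scale_vpp = 2.0 * math.sqrt(2.0) * full_scale_vrms

print(f"{full_scale_vrms:.2f} V RMS, {full_scale_vpp:.2f} V peak-to-peak")
```

Either way, the line-level signal is in the range of a few volts, far below the voltages of up to 200 Vpp needed by the piezo actuators, hence the need for dedicated amplification.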

To address all the issues pointed out above, a solution was designed based on three
key integrated-circuit components: (1) Texas Instruments DRV2667¹⁸ piezo drivers
that can amplify standard audio signals up to 200 Vpp; (2) serial-to-parallel shift
registers with output latches of the 74HC595 family¹⁹; (3) high-voltage MOSFET
relays. For the sake of simplicity, the whole output stage of the HSoundplane was
divided into four identical sections, represented in Fig. 13.8, each consisting of (a) a
flexible PCB with 40 piezo actuators, connected by a flat cable to (b) a driver PCB
with eight audio-to-haptic amplifiers and routing electronics. In order to address the
desired actuators and synchronize their switching with audio signals, (c) a master
controller parses the control data generated at the host computer and routes them to
the appropriate slave drivers.

¹⁷ For further details, see https://www.en.wikipedia.org/wiki/Line_level (last accessed on
Nov 29, 2017).

¹⁸ http://www.ti.com/product/drv2667 (last accessed on Dec. 21, 2017).

¹⁹ http://www.st.com/content/st_com/en/products/automotive-logic-ics/flipflop-registers/m74hc595.html
(last accessed on Dec. 21, 2017).

Fig. 13.10 Schematic of a slave driver board: a 8-channel audio input; b 8 piezo drivers; c 40-point
matrix of relays individually connected to each piezo actuator; d relay control; e microcontroller
for initialization and synchronization. Figure reprinted from [34]

Figure 13.10 shows the detail of a slave driver board, which operates as follows:
(a) Eight audio signals are routed to (b) the piezo drivers, where they are amplified
to high voltage and sent to (c) an 8 × 5 relay matrix that connects to each of the piezo
actuators in the section. This 40-point matrix is addressed by (d) a chain of
serial-to-parallel shift registers commanded by (e) a microcontroller. On start-up, the
microcontroller initializes the piezo drivers, setting among other things their
amplification level. When in running mode, the slave microcontrollers receive routing
information from the master, set a corresponding 40-bit word (each bit corresponding
to one actuator), and send it to the shift registers, which individually open or close
the relays of the matrix. As shown in Fig. 13.10, each amplified audio signal feeds five
points in the relay matrix; therefore, each signal path is hard-coded to five addresses.
Such fixed addressing is the main limitation of the current HSoundplane prototype:
each column of five actuators can only be fed with a single vibrotactile signal.
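The 40-bit routing word can be pictured with the following sketch. The bit ordering (five consecutive bits per audio channel), the convention that a set bit closes a relay, and the byte order sent to the shift-register chain are all assumptions made for illustration; the actual firmware may differ.

```python
CHANNELS_PER_BOARD = 8   # amplified audio signals per slave driver board
ROWS_PER_CHANNEL = 5     # actuators (relays) fed by each signal

def relay_word(active):
    """Pack active (channel, row) pairs into a 40-bit routing word.

    Bit index = channel * 5 + row; this ordering is an assumption.
    """
    word = 0
    for channel, row in active:
        if not (0 <= channel < CHANNELS_PER_BOARD and 0 <= row < ROWS_PER_CHANNEL):
            raise ValueError(f"no relay at channel {channel}, row {row}")
        word |= 1 << (channel * ROWS_PER_CHANNEL + row)
    return word

def shift_register_bytes(word):
    """Split the word into five bytes, most significant first, as could be
    shifted into a chain of five 8-bit registers (byte order is assumed)."""
    return word.to_bytes(5, byteorder="big")

# Two touches on the same board: channel 0 row 2, and channel 7 row 4.
word = relay_word([(0, 2), (7, 4)])
print(f"{word:040b}")
print(shift_register_bytes(word).hex())  # 8000000004
```

The sketch also makes the limitation noted above visible: rows within a channel can be switched individually, but they all receive the same amplified signal.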


13.3.3.2 Software Implementation 


The original Soundplane comes with a client application for Mac OS, which receives
multi-touch data sensed by the interface and transmits them as OSC messages
according to an original format named ‘t3d’ (for touch-3d). The t3d data represent
touch information for each contacting finger, reporting absolute x and y coordinates,
and normal force along the z-axis.

In the HSoundplane prototype, these data are used in real time to generate audio 
and vibration signals and route the latter to the piezo actuators located at the corre- 
sponding x- and y-coordinates. 
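How touch data might be turned into actuator addresses can be sketched as follows. The coordinate ranges assumed for x, y, and z are illustrative and are not taken from the t3d specification; the class and function names are hypothetical.

```python
from dataclasses import dataclass

COLUMNS, ROWS = 30, 5  # actuator matrix matching the Soundplane's tiled pads

@dataclass
class Touch:
    x: float  # assumed to span 0..30 across the columns
    y: float  # assumed to span 0..5 across the rows
    z: float  # normal force, assumed normalized to 0..1

def actuator_for(touch: Touch):
    """Return the (column, row) of the actuator under a touch."""
    column = min(max(int(touch.x), 0), COLUMNS - 1)
    row = min(max(int(touch.y), 0), ROWS - 1)
    return column, row

def routing_for(touches):
    """Group active rows by column, since each column carries one signal."""
    routing = {}
    for t in touches:
        if t.z > 0.0:
            column, row = actuator_for(t)
            routing.setdefault(column, set()).add(row)
    return routing

touches = [Touch(3.4, 1.2, 0.5), Touch(3.9, 4.7, 0.2), Touch(29.99, 0.0, 0.8)]
print(routing_for(touches))  # {3: {1, 4}, 29: {0}}
```

Grouping by column mirrors the hardware design, in which one audio channel serves each column of five actuators.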


Relay Matrix Control 


Synchronization between vibration signals and the four relay matrices happens at 
the host computer level. While vibrotactile signals are output by the MADI system, 
control messages are sent to the master controller via USB. The master controller 
parses the received messages and consequently addresses the slave driver boards on 
a serial bus, setting the state of the relay matrices. 

The choice of using a master controller, rather than addressing each driver board
directly, is motivated by the following observations: first, properly interfacing
several external controllers with a host computer can be complex; second, the midterm
perspective of developing the HSoundplane into a self-contained musical interface
would eventually require removing the controlling computer and working in closed loop.
For that purpose, a main processing unit would be needed, which receives touch data,
processes them, and generates vibrotactile information.


Rendering of Vibrotactile Feedback 


Digital musical interfaces generally enable manifold mapping possibilities between
the users’ gesture and audio output. In addition to what is offered by common
musical interfaces, the HSoundplane provides vibrotactile feedback to the user, and
this requires defining a further mapping strategy. Since the actuators layer is part of
the interface itself, we decided to provide the users with a selection of predefined
vibrotactile feedback mapping strategies. Sound mapping is freely definable as in the
original Soundplane. Three alternative mapping and vibration generation strategies
are implemented in the current prototype:


1. Audio signals controlled by the HSoundplane are used to feed the actuators layer. 
Filtering is available to make the signal dynamics and frequency range comply 
with the response of the piezo actuators (see Sect. 13.3.3.3). This approach is 
straightforward and ensures coherence between the musical output and the tactile 
feedback. In a way, this first strategy mimics what occurs on acoustic musical 
instruments, where the source of vibration coincides with that of sound. 

2. Sine wave signals are used, filtered as explained above. Their frequency follows
the fundamental of the played tones, and their amplitude is set according to the
intensity of the applied forces. When the frequency of the sine wave signals
overlaps with the frequency range of the actuators, this approach results in a clear
vibrotactile response of the interface.

3. A simpler mapping makes use of a fixed-frequency sine wave at 250 Hz for all
actuators. This solution maximizes perceptual effectiveness by using a stimulus
at the frequency of peak tactile sensitivity [39]. On the other hand, since the
produced vibrotactile cues are independent of the sound output, they may result in
occasional perceptual mismatch between touch and audition. This has still to be
investigated.


In a midterm perspective, the last two mapping strategies could be implemented
as a completely self-contained system by relying on the waveform memory provided
by the chosen piezo driver model.

Several other strategies for producing vibrotactile signals starting from the related 
audio are possible, some of which are described in Sect. 7.3. 
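The second and third mapping strategies listed above can be sketched as a simple sine generator. The sample rate, the block size, the linear force-to-amplitude mapping, and the use of force-dependent amplitude in the third strategy are assumptions of this sketch; the filtering stage mentioned in the text is omitted, and all names are illustrative.

```python
import math

SAMPLE_RATE = 48000    # Hz, assumed
FIXED_FREQ_HZ = 250.0  # fixed frequency used by the third strategy

def sine_block(freq_hz, amplitude, n_samples, start_phase=0.0):
    """Generate n_samples of a sine wave and return (samples, next_phase)."""
    step = 2.0 * math.pi * freq_hz / SAMPLE_RATE
    samples = [amplitude * math.sin(start_phase + step * n) for n in range(n_samples)]
    return samples, (start_phase + step * n_samples) % (2.0 * math.pi)

def vibration_block(strategy, note_freq_hz, force, n_samples=64, phase=0.0):
    """Strategy 2 follows the played fundamental; strategy 3 uses 250 Hz.

    force is assumed normalized to 0..1 and mapped linearly to amplitude.
    """
    amplitude = min(max(force, 0.0), 1.0)
    freq = note_freq_hz if strategy == 2 else FIXED_FREQ_HZ
    return sine_block(freq, amplitude, n_samples, phase)

# One block for a 110 Hz tone played with moderate force (strategy 2).
block, phase = vibration_block(strategy=2, note_freq_hz=110.0, force=0.6)
```

Returning the phase reached at the end of each block allows consecutive blocks to be concatenated without discontinuities, which would otherwise be felt as clicks.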


13.3.3.3 Characterization 


Vibration measurements were performed with the same setup described in
Sect. 13.3.1.4. Initially, four types of piezo actuators were selected, each with a
different resonance frequency and capacitance. Since each piezo driver has to feed
five actuators in parallel, particular attention was paid to current consumption and
heat dissipation. The Murata Electronics 7BB-20-6²⁰ piezo actuator was eventually
selected, since it had the smallest capacitance among the considered actuators, and
therefore the lowest current requirements.

Once the piezo layer was finalized, vibrotactile cross talk was informally evaluated.
Thanks to the holed rubber layer, which lets actuators vibrate while keeping
them apart from each other, the HSoundplane is able to render localized vibrotactile
feedback with imperceptible vibration spill at other locations, even when touching
right next to the target feedback point.

Vibration frequency response was measured in the vibrotactile range as follows:
the accelerometer was attached with double-sided tape to several pads of the top
surface, and the underlying piezo transducers were fed with a sinusoidal sweep [9]
between 20 and 1000 Hz, at different amplitudes. Making use of the sensitivity
specifications of the I/O chain, values of acceleration in m/s² and dB (re 10⁻⁶ m/s²)
were obtained from the digital amplitude values in dBFS. Figure 13.11 shows the results
of measurements performed at four exemplary piezo transducers, for the maximum
vibration level achievable without apparent distortion. Such signals are well above
the vibrotactile thresholds reported in Sect. 4.2 for active touch, effectively
resulting in intense tactile sensation. In general, the frequency responses
measured at different locations over the surface are very similar in shape, with a
pronounced peak at about 40 Hz. In some cases, they show minor amplitude offsets
(see, e.g., the response of piezo 102 in Fig. 13.11) that can be easily compensated for.
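The conversion from digital levels to acceleration values can be sketched as follows. The full-scale acceleration used in the example is hypothetical; in practice it has to be derived from the sensitivities of the accelerometer, its conditioner, and the audio interface. Function names are illustrative.

```python
import math

ACCEL_REFERENCE = 1e-6  # m/s^2, reference for acceleration levels in dB

def dbfs_to_acceleration(level_dbfs, full_scale_acceleration):
    """Convert a digital level in dBFS to acceleration in m/s^2.

    full_scale_acceleration is the acceleration that drives the input chain
    to 0 dBFS (a property of the specific measurement setup).
    """
    return full_scale_acceleration * 10.0 ** (level_dbfs / 20.0)

def acceleration_level_db(acceleration):
    """Express an acceleration in dB re 1e-6 m/s^2."""
    return 20.0 * math.log10(acceleration / ACCEL_REFERENCE)

# Hypothetical chain in which 100 m/s^2 corresponds to 0 dBFS.
a = dbfs_to_acceleration(-40.0, full_scale_acceleration=100.0)
print(f"{a:.3f} m/s^2, {acceleration_level_db(a):.1f} dB")
```

With the 10⁻⁶ m/s² reference, an acceleration of 1 m/s² corresponds to 120 dB, which helps in reading levels reported in this unit.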

Further measurements are planned in the time domain to test synchronization
between audio signals and relay control, and to quantify closed-loop latency from
touch events to the onset of vibrotactile feedback. Also, similar to what was done for
the Touch-Box (see Sect. 13.3.1.2), we plan to characterize finger pressing force as
measured by the HSoundplane.

²⁰ https://www.murata.com/products/productdetail?partno=7BB-20-6 (last accessed on Dec. 21,
2017).


13.4 Conclusions 


A few exemplary interfaces providing vibrotactile feedback were described, which 
have been recently developed by the authors for the purpose of conducting various 
perceptual experiments, and for musical applications. Details were given on the 
design process and on the technological solutions adopted for rendering accurate 
vibratory behavior. Measurements were performed to characterize the interfaces’ 
input (e.g., finger pressing force, or keyboard velocity) and output (vibratory cues). 

It is suggested that the characterization and validation of self-developed haptic
devices is especially important when employing them in psychophysical experiments,
as well as in evaluation and performance assessments (see the studies reported in
Chap. 4, Sect. 5.3.2.2, and Chap. 7). On the one hand, as opposed to relying on
assumptions based on components’ specifications, characterization offers objective,
verified data to designers and experimenters, enabling them, respectively, to refine
the developed devices and to better interpret experimental results. For instance,
characterization data describing the actual nature of rendered haptic feedback may
offer a better understanding of its perceived qualities. On the other hand, the
characterization of haptic prototypes, together with their technical documentation,
allows reproducible implementations and enables other users and designers to carry on
research and development, rather than resulting in one-of-a-kind devices.


Acknowledgements The authors wish to thank Randy Jones, the inventor of the original
Soundplane, for providing technical support during the development of the HSoundplane
prototype, and Andrea Ghirotto and Lorenzo Malavolta for their help in the preparation
of the piano vibration samples. This research was pursued as part of project AHMI
(Audio-Haptic modalities in Musical Interfaces, 2014-2016), funded by the Swiss
National Science Foundation.


References 


1. Askenfelt, A., Jansson, E.V.: From touch to string vibrations I: timing in the grand piano action. 
J. Acoust. Soc. Am. 88(1), 52-63 (1990) 

2. Bensmaia, S.J., Hollins, M.: The vibrations of texture. Somatosens. Mot. Res. 20(1), 33-43 
(2003) 

3. Brewster, S., Brown, L.M.: Tactons: structured tactile messages for non-visual information 
display. In: Proceedings of the Australas. User Interface Conference Dunedin, New Zealand 
(2004) 

4. Cadoz, C., Lisowski, L., Florens, J.L.: A modular feedback keyboard design. Comput. Music 
J. 14(2), 47-51 (1990) 

5. Choi, S., Kuchenbecker, K.J.: Vibrotactile display: perception, technology, and applications. 
Proc. IEEE 101(9), 2093-2104 (2013) 

6. Civolani, M., Fontana, F., Papetti, S.: Efficient acquisition of force data in interactive shoe 
designs. In: Nordahl, R., Serafin, S., Fontana, F., Brewster, S. (eds.) Haptic and Audio Interaction 
Design (HAID). Lecture Notes in Computer Science (LNCS), vol. 6306, pp. 129-138. Springer, 
Berlin, Heidelberg (2010) 

7. Dahl, S., Bresin, R.: Is the player more influenced by the auditory than the tactile feedback 
from the instrument? In: Proceedings of the Digital Audio Effects Conference (DAFx), pp. 
6-9. Limerick, Ireland (2001) 

8. Deber, J., Araujo, B., Jota, R., Forlines, C., Leigh, D., Sanders, S., Wigdor, D.: Hammer time!: 
a low-cost, high precision, high accuracy tool to measure the latency of touchscreen devices. 
In: Proceedings of the CHI ’16 Conference on Human Factors in Computing Systems, pp. 
2857-2868. ACM Press, San Jose, CA, USA (2016) 

9. Farina, A.: Advancements in impulse response measurements by sine sweeps. In: Proceedings 
of the Audio Engineering Society Conference, vol. 122. AES, Vienna, Austria (2007) 

10. Fontana, F., Avanzini, F., Järveläinen, H., Papetti, S., Klauer, G., Malavolta, L.: Rendering and 
subjective evaluation of real versus synthetic vibrotactile cues on a digital piano keyboard. In: 
Proceedings of the Sound and Music Computing Conference (SMC), pp. 161-167. Maynooth, 
Ireland (2015) 

11. Fontana, F., Avanzini, F., Järveläinen, H., Papetti, S., Zanini, F., Zanini, V.: Perception of 
interactive vibrotactile cues on the acoustic grand and upright piano. In: Proceedings of the 
Joint International Computer Music Conference and Sound and Music Computing Conference 
(ICMC-SMC). Athens, Greece (2014) 

12. Fontana, F., Papetti, S., Civolani, M., dal Bello, V., Bank, B.: An exploration on the influence 
of vibrotactile cues during digital piano playing. In: Proceedings of the Sound and Music 
Computing Conference (SMC), pp. 273-278. Padua, Italy (2011) 

13. Galembo, A., Askenfelt, A.: Quality assessment of musical instruments—effects of multimodality. 
In: Proceedings of the 5th Triennial Conference of the European Society for the Cognitive 
Sciences of Music (ESCOM). Hannover, Germany (2003) 

14. Giordano, M., Sinclair, S., Wanderley, M.M.: Bowing a vibration-enhanced force feedback 
device. In: Proceedings of the Conference on New Interfaces for Musical Expression (NIME). 
Ann Arbor, Michigan, USA (2012) 

15. Goebl, W., Bresin, R.: Measurement and reproduction accuracy of computer-controlled grand 
pianos. J. Acoust. Soc. Am. 114(4), 2273 (2003) 

16. Guizzo, E.: Keyboard maestro. IEEE Spect. 47(2), 32-33 (2010) 


17. Jack, R.H., Stockman, T., McPherson, A.: Effect of latency on performer interaction and 
subjective quality assessment of a digital musical instrument. In: Proceedings of the Audio Mostly, 
pp. 116-123. ACM Press, New York, USA (2016) 

18. Järveläinen, H., Papetti, S., Schiesser, S., Grosshauser, T.: Audio-tactile feedback in 
musical gesture primitives: finger pressing. In: Proceedings of the Sound and Music Computing 
Conference (SMC), pp. 109-114. Stockholm, Sweden (2013) 

19. Jones, R., Driessen, P., Schloss, A., Tzanetakis, G.: A force-sensitive surface for intimate 
control. In: Proceedings of the Conference on New Interfaces for Musical Expression (NIME). 
Pittsburgh, Pennsylvania, USA (2009) 

20. Lee, J., Choi, S.: Real-time perception-level translation from audio signals to vibrotactile 
effects. In: Proceedings of the CHI ’13 Conference on Human Factors in Computing Systems, 
p. 2567. ACM Press, New York, USA (2013) 

21. Lylykangas, J., Surakka, V., Rantala, J., Raisamo, R.: Intuitiveness of vibrotactile speed 
regulation cues. ACM Trans. Appl. Percept. 10(4), 1-15 (2013) 

22. Maclean, K., Enriquez, M.: Perceptual design of haptic icons. In: Proceedings of the Eurohaptics 
Conference, pp. 351-363. Dublin, Ireland (2003) 

23. Maeda, S., Griffin, M.J.: A comparison of vibrotactile thresholds on the finger obtained with 
different equipment. Ergonomics 37(8), 1391-1406 (1994) 

24. Marshall, M.T., Wanderley, M.M.: Vibrotactile feedback in digital musical instruments. In: 
Proceedings of the Conference on New Interfaces for Musical Expression (NIME), pp. 226-229. 
Paris, France (2006) 

25. Massimino, M.J.: Improved force perception through sensory substitution. Control Eng. Pract. 
3(2), 215-222 (1995) 

26. McMahan, W., Romano, J.M., Abdul Rahuman, A.M., Kuchenbecker, K.J.: High frequency 
acceleration feedback significantly increases the realism of haptically rendered textured 
surfaces. In: Proceedings of the IEEE Haptics Symposium, pp. 141-148. Waltham, Massachusetts, 
USA (2010) 

27. Mortimer, B.J.P., Zets, G.A., Cholewiak, R.W.: Vibrotactile transduction and transducers. J. 
Acoust. Soc. Am. 121(5), 2970-2977 (2007) 

28. Oboe, R., De Poli, G.: A multi-instrument force-feedback keyboard. Comput. Music J. 30(3), 
38-52 (2006) 

29. Okamoto, S., Konyo, M., Tadokoro, S.: Vibrotactile stimuli applied to finger pads as biases for 
perceived inertial and viscous loads. IEEE Trans. Haptics 4(4), 307-315 (2011) 

30. Okamura, A.M., Dennerlein, J.T., Howe, R.D.: Vibration feedback models for virtual 
environments. In: Proceedings of the IEEE International Conference on Robotics and Automation 
(ICRA), vol. 1, pp. 674-679. Leuven, Belgium (1998) 

31. Pacchierotti, C.: Cutaneous Haptic Feedback in Robotic Teleoperation. Springer Series on 
Touch and Haptic Systems. Springer Int. Publishing, Cham (2015) 

32. Papetti, S., Fontana, F., Civolani, M., Berrezag, A., Hayward, V.: Audio-tactile display of 
ground properties using interactive shoes. In: Nordahl, R., Serafin, S., Fontana, F., Brewster, 
S. (eds.) Haptic and Audio Interaction Design (HAID). Lecture Notes in Computer Science 
(LNCS), vol. 6306, pp. 117-128. Springer, Berlin, Heidelberg (2010) 

33. Papetti, S., Järveläinen, H., Giordano, B.L., Schiesser, S., Fröhlich, M.: Vibrotactile sensitivity 
in active touch: effect of pressing force. IEEE Trans. Haptics 10(1), 113-122 (2017) 

34. Papetti, S., Schiesser, S., Fröhlich, M.: Multi-point vibrotactile feedback for an expressive 
musical interface. In: Proceedings of the Conference on New Interfaces for Musical Expression 
(NIME), Baton Rouge, LA, USA (2015) 

35. Prattichizzo, D., Pacchierotti, C., Rosati, G.: Cutaneous force feedback as a sensory subtraction 
technique in Haptics. IEEE Trans. Haptics 5(4), 1-13 (2012) 

36. Salisbury, C.M., Gillespie, R.B., Tan, H.Z., Barbagli, F., Salisbury, J.K.: What you can’t feel 
won’t hurt you: evaluating haptic hardware using a haptic contrast sensitivity function. IEEE 
Trans. Haptics 4(2), 134-146 (2011) 

37. Schroeder, M.R.: New method of measuring reverberation time. J. Acoust. Soc. Am. 37(6), 
1187-1188 (1965) 


38. Stepp, C.E., An, Q., Matsuoka, Y.: Repeated training with augmentative vibrotactile feedback 
increases object manipulation performance. PLoS One 7(2) (2012) 

39. Verrillo, R.T.: Vibration sensation in humans. Music Percept. 9(3), 281-302 (1992) 

40. Verrillo, R.T.: Vibrotactile thresholds measured at the finger. Percept. Psychophys. 9(4), 329-330 
(1971) 

41. Visell, Y., Giordano, B.L., Millet, G., Cooperstock, J.R.: Vibration influences haptic perception 
of surface compliance during walking. PLoS One 6(3), e17697 (2011) 

42. Yamaoka, M., Yamamoto, A., Higuchi, T.: Basic analysis of stickiness sensation for tactile 
displays. In: Ferre, M. (ed.) Haptics: Perception, Devices and Scenarios. Lecture Notes in 
Computer Science (LNCS), vol. 5024, pp. 427-436. Springer, Berlin, Heidelberg (2008) 

43. Yao, H.Y., Hayward, V.: An experiment on length perception with a virtual rolling stone. In: 
Proceedings of the EuroHaptics Conference, pp. 275-278. Paris, France (2006) 

44. Yao, H.Y., Hayward, V.: Design and analysis of a recoil-type vibrotactile transducer. J. Acoust. 
Soc. Am. 128(2), 619-627 (2010) 


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, 
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate 
credit to the original author(s) and the source, provide a link to the Creative Commons license and 
indicate if changes were made. 


The images or other third party material in this chapter are included in the chapter’s Creative 
Commons license, unless indicated otherwise in a credit line to the material. If material is not 
included in the chapter’s Creative Commons license and your intended use is not permitted by 
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from 
the copyright holder. 


Glossary and Abbreviations 


Actuator A class of electromechanical transducers converting electrical signals 
into mechanical displacement. Often called vibrotactile actuators/transducers or 
tactors. In the context of the present volume, such devices are employed to convey 
vibratory cues to the user. 

Arduino An open-source microcontroller-based hardware and software platform 
suited to rapid prototyping. Its main purpose is to process input coming from 
various sensors and in turn generate control data (e.g. for creating interactive 
objects). 

Cutaneous Of the skin. 

Cut-off or Corner Frequency The frequency at which a filter attenuates a signal 
spectrum by 3 dB. 
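
As an illustration of the 3 dB figure, the following minimal sketch evaluates the gain of an ideal first-order low-pass filter. The filter order, the function name and the 250 Hz cut-off are assumptions chosen for the example and are not taken from the chapters of this volume.

```python
import math


def lowpass_gain_db(freq_hz, cutoff_hz):
    """Gain in dB of an ideal first-order low-pass filter at freq_hz.

    Assumes the textbook magnitude response |H(f)| = 1 / sqrt(1 + (f / fc)^2).
    """
    magnitude = 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** 2)
    return 20.0 * math.log10(magnitude)


cutoff = 250.0  # hypothetical cut-off frequency in Hz
for f in (25.0, 250.0, 2500.0):
    # Well below the cut-off the gain is close to 0 dB; at the cut-off it is
    # about -3 dB; above it the attenuation keeps growing.
    print(f"{f:7.1f} Hz: {lowpass_gain_db(f, cutoff):7.2f} dB")
```

At the cut-off frequency the magnitude equals 1/sqrt(2), which corresponds to approximately -3.01 dB.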

DoF—Degrees of Freedom The number of independent parameters describing a 
mechanical system. 

DMI—Digital Musical Instrument A class of instruments composed of an interface 
capable of sensing users’ gestures and a sound generating unit, usually in the 
form of a digital synthesizer. These two independent components are connected 
by an arbitrary mapping layer. 

Digital Musical Interface A device controlling software or hardware for musical 
sound processing. Typical examples are MIDI controllers, such as keyboards. 
When gestures are mapped from the interface to a virtual musical instrument, a 
DMI is created. 

Enactive Attribute that refers to the cognitive process arising from the interaction 
between an acting subject and the environment. 

Exciter See actuator. 

Filter A generic tool for data or signal processing. For example, in the case of 
signal processing, low-pass or high-pass filters shape the frequency spectrum 
of a signal by respectively attenuating frequencies above or below their cutoff 
frequency, while band-pass filters attenuate frequencies below and above a certain 
range. With regard to data processing, MIDI filters are used to modify a MIDI 
data stream, e.g. by letting only certain messages pass through. 


Force Feedback Same as reactive force. See kinaesthetic feedback. 

JND—Just-Noticeable Difference A term commonly used in psychophysics to 
represent the amount by which a property of a physical stimulus (e.g. intensity, 
frequency) must be changed in order for a difference to be detectable by a person. 

Kinaesthetic Feedback Feedback targeting muscles and joints (as opposed to 
vibrotactile feedback which targets the skin). It can be conveyed, for example, 
through a force-feedback interface. 

Max or Max/MSP or Max/MSP/Jitter A commercial visual programming lan- 
guage and software environment for interactive multimedia computing, running 
on Mac and Windows operating systems. The MSP component addresses signal 
processing, while Jitter is for video and matrix computing. 

MIDI—Musical Instrument Digital Interface A technical standard defining a 
data (non-audio) communication protocol and electrical connectors for interfacing 
digital musical devices. A typical example is a MIDI keyboard sending note on/off 
and velocity (note dynamics) messages to a synthesizer. 
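
The note on/off messages mentioned above can be sketched as follows, assuming the usual three-byte channel voice message layout (a status byte combining message type and channel, followed by note number and velocity). The function name and example values are illustrative only.

```python
NOTE_OFF = 0x80  # status nibble of a note-off message
NOTE_ON = 0x90   # status nibble of a note-on message


def note_message(status, channel, note, velocity):
    """Build a three-byte MIDI note message.

    channel is 0-15 (shown as 1-16 on most devices); note and velocity
    are 7-bit values in the range 0-127.
    """
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    if not 0 <= note <= 127 or not 0 <= velocity <= 127:
        raise ValueError("note and velocity must be in 0..127")
    return bytes([status | channel, note, velocity])


# Middle C (note number 60) pressed with velocity 100 on the first channel,
# then released.
press = note_message(NOTE_ON, 0, 60, 100)
release = note_message(NOTE_OFF, 0, 60, 0)
print(press.hex(), release.hex())
```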

OSC—Open Sound Control A protocol for exchanging control data among 
musical devices and music software. OSC messages are transported across networks 
(e.g. a local network or the Internet). OSC is sometimes used as an alternative to the 
older MIDI protocol; unlike MIDI, the OSC standard does not define a hardware interface. 
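
A minimal sketch of how a single OSC message may be encoded, assuming the OSC 1.0 binary layout (null-terminated strings padded to a multiple of four bytes, a type tag string, and big-endian arguments). The address /vib/amp and the helper names are hypothetical and serve only as an example.

```python
import struct


def osc_string(text):
    """Encode text as an OSC string: ASCII, null-terminated, padded to 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    padding = (-len(data)) % 4
    return data + b"\x00" * padding


def osc_float_message(address, value):
    """Encode an OSC message carrying one 32-bit float argument."""
    # ",f" is the type tag string announcing a single float32 argument.
    return osc_string(address) + osc_string(",f") + struct.pack(">f", value)


# Hypothetical address controlling a vibration amplitude.
packet = osc_float_message("/vib/amp", 0.5)
print(len(packet), packet)
```

The resulting bytes would typically be sent as the payload of a UDP datagram, although the protocol itself is independent of the transport.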

Pd—Pure Data An open-source visual programming language and software environment 
for interactive multimedia computing. Pd runs on a wide range of platforms, 
from Mac, Linux and Windows, to Android and iOS. It is in a way the free 
alternative to Max, with which it shares code and various components. 

Physical or Physics-Based Modelling Sound synthesis methods in which the 
generated sound is computed using a mathematical simulation of the acoustical 
behaviour of its source, usually a musical instrument. 

Proprioception The perception of the position and movement of one’s body and 
body parts, as conveyed by the somatosensory system. 

RMS—Root Mean Square A numerical value, usually expressed in dB, representing 
the averaged power of a signal in a given time window. It is obtained by 
averaging the squared values of the signal over that window, and subsequently 
taking the square root. 
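
For a sampled signal, the computation can be sketched as below. The function names, the reference level of 1.0 and the test signal are assumptions made for the example.

```python
import math


def rms(samples):
    """Root mean square of a sequence of samples (one analysis window)."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def rms_db(samples, reference=1.0):
    """RMS level in dB relative to an assumed reference amplitude."""
    return 20.0 * math.log10(rms(samples) / reference)


# One full period of a unit-amplitude sine wave, sampled at 1000 points.
n = 1000
sine = [math.sin(2.0 * math.pi * k / n) for k in range(n)]
print(rms(sine), rms_db(sine))
```

For a full period of a unit-amplitude sine wave the RMS value is 1/sqrt(2), i.e. about 0.707, or roughly -3 dB relative to the peak amplitude.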

Sampler A musical tool, existing both in hardware and software forms, which 
generates sound from recorded audio samples. 

Sequencer A class of hardware or software tools for music (MIDI data and/or 
audio) recording, editing and playback. 

Shaker See actuator. Usually refers to large size and powerful actuators, employed 
to vibrate objects having a large mass (e.g. to convey whole-body vibration through 
a seat). 

Somatosensation A collective term for the sensations of touch, temperature, body 
and body parts position and movement (proprioception), and pain, which arise 
through cutaneous receptors, joints, tendons and other internal organs. 

Tactor See actuator. 

Vibrotactile Relating to the perception of vibration through touch (vibrotaction). 

Virtual Musical Instrument A software simulation of a musical instrument (either 
existing or not) that generates sound in response to data input (e.g. MIDI or OSC). 
When coupled with a digital musical interface, a complete DMI is created. 


